06562449

EDITING SYSTEM AND METHOD USED FOR TRANSCRIPTION OF TELEPHONE MESSAGE 

PUB. NO. : 20-00148182 [JP 2000148182 A] 

PUBLISHED: May 26, 2000 (20000526) 
INVENTOR(s): MUKUNDO PADOMANABUHAN

APPLICANT(s):
APPL. NO.:
FILED:
PRIORITY:
INTL CLASS: G10L-015/22; G06F-017/28; G10L-013/00; G10L-015/00;
H04M-003/42

ABSTRACT

PROBLEM TO BE SOLVED: To correct a transcribed text by voice by
playing back synthesized speech, having the user correct the
synthesized speech, and transmitting the corrected speech as text
through a communication system.



SOLUTION: A telephone server 26 transfers a text and a diagnosis to a
speech synthesizing server 34. The speech synthesizing server 34
creates synthesized speech and returns it to the telephone server 26.
The telephone server 26 plays the synthesized speech back to a user
over telephone lines. One purpose of playing back the synthesized
speech is to allow the user to correct an unacceptable or inaccurate
region. The telephone server 26 provides the user with an option of
correcting a message. The voice playback related to a correcting
mechanism 36 can be achieved in many ways. When the user is satisfied
with the transcription, the telephone server 26 transmits the text
together with a recorded voice to a message server

ABSTRACT

... SD). On the side of information receivers 6 and 7, the text
information is separated from the intermediate language information
and displayed, voices are synthesized using the intermediate language
information, and the resulting synthetic voice information is output.
Namely, the text data for voice synthesis are analyzed as part of the
voice synthesizing processing, and the resulting information, put into
a prescribed data format, is transmitted as the intermediate language
information from the server side (information transmitters) to the
terminal equipment side (information receivers).

COPYRIGHT: (C) 1999, JPO 

ABSTRACT

...BE SOLVED: To provide a voice browser system which enables even a
visually handicapped person to acquire WWW information.

SOLUTION: This system includes a server 100 that has a voice request
acquisition means 101 which acquires a request from a client 200 via
voice input, a voice recognition...

... which transmits a request to the URL designated by the client 200,
based on the recognition result of the means 102, to an internet 70, a
voice data generation means 104 which extracts a read-aloud text from
the answer given from the internet 70 and converts the text into voice
data to synthesize the voices, and a voice data transmission means 105
which transmits the voice data generated by the means 104 to the
client 200. The system...

... which inputs the requests given from the users in voices, a
request issue means 202 which extracts the URL from the result
acquired from the server 100 and gives a request of an HTML file to
the server 100 based on the extracted URL, and a voice output means
203 which outputs the voice data received from the server 100.

1/12 - (C) IBM CORP 1993
AN - NN9501527 

TI - Techniques for Modifying Prosodic Information in a Text-to-Speech 
System 

PUB - IBM Technical Disclosure Bulletin, January 1995, US 
VOL - 38 
NR - 1 

PG - 527 - 528 

TXT - Disclosed is a technique for modifying prosodic information in
a text-to-speech synthesis system by using a sample of speech. When
the generated prosody of the text-to-speech system needs to be
modified, it is very difficult to teach the system correct prosody.
By analyzing a sample of speech, such prosodic information as
phonetic duration, pitch pattern, and stress pattern can be estimated
automatically, and these prosodic parameters are used instead of the
generated prosody. They are also used to retrain the prosodic models
of the text-to-speech synthesis system.

Phonetic durations are estimated by using phonetic Hidden
Markov Models (HMMs) for continuous speech recognition. Since the
spoken text is known, the sequence of the phonetic HMMs of the spoken
text is aligned with the speech sample by using the Viterbi
algorithm. On the basis of the alignment, each phonetic duration is
estimated. The pitch patterns, on the other hand, are estimated by
using a conventional pitch detector, modified to keep them within the
original speaker's range. The stress patterns are calculated from the
raw power for each frame.

When these three sets of parameters of the text-to-speech
synthesis system are replaced with those extracted from the speech
sample, the prosody of the synthesized speech becomes very natural.
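
As a rough illustration of the duration-estimation step, the sketch
below aligns a known phone sequence to the frames of a speech sample
with a Viterbi search and reads the durations off the alignment. It is
a simplification under stated assumptions, not the disclosed
implementation: each phone is reduced to a single state with no
transition penalties, and the per-frame log-likelihood table `loglik`
and the function name `align_durations` are hypothetical.

```python
from typing import List

NEG_INF = float("-inf")


def align_durations(loglik: List[List[float]]) -> List[int]:
    """Frames per phone for the best monotonic (left-to-right) alignment.

    loglik[t][p] is the (assumed, precomputed) log-likelihood of frame t
    under the p-th phone of the known spoken text.
    """
    n_frames = len(loglik)
    n_phones = len(loglik[0])
    if n_frames < n_phones:
        raise ValueError("need at least one frame per phone")

    # best[t][p]: best score of any alignment that puts frame t in phone p.
    best = [[NEG_INF] * n_phones for _ in range(n_frames)]
    # entered[t][p]: True if frame t is the first frame of phone p on that path.
    entered = [[False] * n_phones for _ in range(n_frames)]
    best[0][0] = loglik[0][0]

    for t in range(1, n_frames):
        for p in range(n_phones):
            stay = best[t - 1][p]
            move = best[t - 1][p - 1] if p > 0 else NEG_INF
            if move > stay:
                best[t][p] = move + loglik[t][p]
                entered[t][p] = True
            else:
                best[t][p] = stay + loglik[t][p]

    # Backtrace from the last phone at the last frame, counting frames.
    durations = [0] * n_phones
    p = n_phones - 1
    for t in range(n_frames - 1, -1, -1):
        durations[p] += 1
        if entered[t][p]:
            p -= 1
    return durations
```

Multiplying each count by the analysis frame period would give the
phonetic durations in time units; a real system would use multi-state
phone HMMs with transition probabilities rather than this one-state
reduction.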

2/12 - (C) IBM CORP 1993 
AN - NB9309235 

TI - Voice Activated Music System 

PUB - IBM Technical Disclosure Bulletin, September 1993, US 

VOL - 36 

NR - 9B 

PG - 235 - 236 

TXT - Disclosed is an approach to a voice I/O system for music or
multimedia applications in which musical parameters are automatically
activated based on voice command input.

Current computer-based musical systems are based on user
interaction with MIDI I/O capability (e.g., musical keyboard, PC
keyboard, or music synthesizer module). This process assumes that a
musician encodes all necessary musical performance information to be
interpreted and processed (i.e., the burden is placed on the
musician). The process is slow, since every detail of the musical
performance must be entered manually, and cumbersome, since a computer
pointing device (e.g., mouse) must be used.

The approach taken in this disclosure places less burden on
the musician (or end user who is not a musician) since he no longer
inputs musical sequences by "hand"; rather, voice command input yields
computer-generated (automated) sequences, which "fill in" or modify
desired musical parameters such as pitch, chord sequence,
instrumentation, style, etc.

A system environment for a voice activated system would consist
of the following.

   Voice Input: A voice recognizer providing discrete or continuous
   (word or phrase) recognition based on utterances (i.e., lexical).

   Voice Output: A speech synthesizer (text-to-speech), which outputs
   the musical parameters generated (i.e., speech output of the
   musical parameters and not the music generated).

   Music Output: Automated computer-generated music via MIDI.

   Session: The musician sits in front of the system and inputs via
   voice the musical elements desired; the system outputs the
   (generated) musical parameters and synthesized speech (output).

The following illustrates an example session (discrete or
continuous), which is in structured-English format. For
discrete/continuous utterance (isolated or non-isolated words) Do:

1. Match pattern of pre-stored voice template for voice utterance.

2. Execute Matched Pattern:
      If pattern found, automatically generate an output for musical
      attributes desired.
      Else output speech error message.

3. Output as synthesized speech the musical parameters selected.
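
The three steps above can be sketched as a lookup against pre-stored
templates. This is only an illustration: exact matching of recognized
text stands in for acoustic template matching, and the template table,
the parameter names, and `handle_utterance` are hypothetical rather
than part of the disclosure.

```python
from typing import Dict

# Hypothetical pre-stored "templates": recognized phrase -> musical parameters.
TEMPLATES: Dict[str, Dict[str, str]] = {
    "seventh chord first inversion": {"chord": "seventh", "inversion": "first"},
    "style jazz": {"style": "jazz"},
}


def handle_utterance(utterance: str) -> str:
    """Return the text that would be handed to the speech synthesizer."""
    # Step 1: match the utterance against the stored templates
    # (case and extra whitespace are normalized first).
    key = " ".join(utterance.lower().split())
    params = TEMPLATES.get(key)

    # Step 2 (else branch): no pattern found, so report an error by speech.
    if params is None:
        return "Error: command not recognized"

    # Step 2 (if branch) would generate the musical output from `params`;
    # step 3 speaks back the musical parameters that were selected.
    spoken = ", ".join(f"{name} {value}" for name, value in params.items())
    return f"Selected {spoken}"
```

For example, `handle_utterance("style jazz")` would return the string
"Selected style jazz" for the synthesizer to speak.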

The following pseudo-code illustrates an example of how pitch
would be determined.

/* pitch is determined as root of the chord sequence; note          */
/* N = some upper limit, and 48 = C (below middle C), 50 = D, etc.  */

int pitch1[N] = {60, 50, 55, 48, ...};
int pitch2[N] = {50, 55, 48, 55, ...};
int pitch3[N] = {55, 60, 48, 48, ...};
dur = bound;

main()
{
   call Generate_chords(voice_input);
}  /* end of main */

Procedure Generate_chords(voice_input);
/* compute chordal parameters */
char string voice_input;
{
   if (voice_input == "seventh chord first inversion")
      for (m = 0; m < bound; m++)
      {
         /* one Chord-Note line per chord tone above the root pitch1[m] */
         printf("Chord-Note(%d, %d);\n", pitch1[m] + 4, dur);
         printf("Chord-Note(%d, %d);\n", pitch1[m] + 7, dur);
         printf("Chord-Note(%d, %d);\n", pitch1[m] + 10, dur);
         printf("Chord-Note(%d, %d);\n", pitch1[m], dur);
         printf("Chord-Note(%d, %d);\n", pitch1[m] + 12, dur);
      }
}  /* end of Generate_chords() */

The above approach is a novel technique for automatic
generation of music in a computer-based environment. Additionally,
current (and proposed future) music systems do not rely on
voice-activated end-user response. Future (enhanced) music and audio
systems will pursue this technology, and make it pervasive across
product offerings (e.g., Yamaha).

3/12 - (C) IBM CORP 1993 
AN - NN92016 

TI - Rule Based Speech Synthesis Method Using a Residual Codebook. 

PUB - IBM Technical Disclosure Bulletin, January 1992, US 

VOL - 34 

NR - 8 

PG - 6 - 9 

TXT - - A method to synthesize natural-sounding speech for
unlimited-vocabulary text by using an effectively-compressed residual
source codebook is proposed here.

BACKGROUND: In rule-based speech synthesis to which the LPC
(Linear Predictive Coding) speech analysis/synthesis technique is
applied, the use of the LPC residual signal is one of the key issues
in improving the quality of synthetic speech (1-4). Two substantial
problems remain unsolved in applying the LPC residual signal to
rule-based LPC speech synthesis, as follows.

(1) Quality degradation according to the pitch modification

In most rule-synthesis methods, a number of speech synthesis
units (usually, several hundreds of units) are extracted from actual
speech samples. To use these units for generating speech of
arbitrary texts, which are different from the sample texts, the
original pitch of the speech synthesis units should be modified to
coincide with the pitch contour of the new texts. The spectral
distortion caused by the pitch modification degrades the quality of
synthetic speech. This type of quality degradation is more
considerable in a residual-excited synthesizer than in a
pulse/noise-excited synthesizer, because the residual signal is
fairly sensitive to the original pitch frequency whereas the pulse
signal has nothing to do with it.

(2) Sizable data of LPC residual signals for speech synthesis units

The LPC residual signal is defined as the prediction residue of
LPC analysis. Using the original residual data for all the speech
synthesis units causes a problem in implementing practical speech
synthesis systems, such as a Digital Signal Processor (DSP)-based
system, which does not usually have sufficient data memory area to
store the residual sources.

This proposal focuses mainly on problem (2) above and, as a
result, also overcomes problem (1) in a sense. Our experimental
system has about 360 speech synthesis units. The data size for
spectral data (the basic part of unit data) is 80 KB. On the other
hand, the residual data size is 480 KB. To improve the speech
quality, many more units should be accumulated to reflect minute
contextual effects in the synthetic speech. The problem therefore
becomes more critical as quality is improved, because the
residual data size increases in proportion to the number of units.



PROPOSED METHOD: Creation of a Codebook for Voiced Residual
Signals

Given this background, we propose here a method to create an
effectively compressed residual source codebook without degrading the
quality of synthetic speech. There are two kinds of residual
signals: the voiced and the voiceless. The voiced residual signals
occupy 70-80% of the whole residual data. This proposal concerns
only this larger part of the residual signals, i.e., the voiced
residual. For voiceless speech, we use the original residual signals
here. By the proposed method, the residual signals for voiced speech
are compressed down to about one twentieth (1/20), without degrading
the quality of synthetic speech, in the form of a residual codebook,
which is created by the procedure described below.

(1) Extraction of 1-pitch residual signal

1-pitch residual signals are extracted by observation for each
frame data (4738 voiced frames in total) of all the synthesis units.

(2) Clustering 

By clustering the spectral data of the residual signals using a
clustering method (the LBG method (5) is used here), a
residual codebook which has 256 centroids is created. The number of
centroids of the residual codebook should be determined
experimentally in consideration of the trade-off between the codebook
size and the quality of the synthesized speech. By a preliminary
experiment, we selected 256 as the codebook size.
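
A minimal sketch of LBG-style (Linde-Buzo-Gray) codebook training is
given below to make the clustering step concrete. It assumes plain
Euclidean distance on the spectral vectors, an additive perturbation
when splitting centroids, and a fixed number of refinement passes; the
names `lbg_codebook`, `eps`, and `iterations` are illustrative, and the
disclosure does not state which distance measure or stopping rule was
used.

```python
from typing import List, Sequence

Vector = List[float]


def _mean(vectors: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of a non-empty set of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def _dist2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (an assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _nearest(v: Sequence[float], codebook: Sequence[Vector]) -> int:
    """Index of the centroid closest to v."""
    return min(range(len(codebook)), key=lambda i: _dist2(v, codebook[i]))


def lbg_codebook(data: Sequence[Sequence[float]], size: int,
                 eps: float = 0.01, iterations: int = 20) -> List[Vector]:
    """Grow a codebook to `size` centroids (a power of two) by splitting."""
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a power of two")
    codebook = [_mean(data)]                 # start from the global centroid
    while len(codebook) < size:
        # Split every centroid into two slightly perturbed copies.
        codebook = [[x + sign * eps * (abs(x) + 1.0) for x in c]
                    for c in codebook for sign in (1, -1)]
        # Refine: assign each vector to its nearest centroid, then re-average.
        for _ in range(iterations):
            cells: List[List[Sequence[float]]] = [[] for _ in codebook]
            for v in data:
                cells[_nearest(v, codebook)].append(v)
            # An empty cell keeps its previous centroid.
            codebook = [_mean(cell) if cell else c
                        for cell, c in zip(cells, codebook)]
    return codebook
```

With `size=256` and the per-frame residual spectra as `data`, this
would produce a codebook of the size selected in the text; each voiced
frame could then be labelled with the index of its nearest centroid.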

(3) Conversion of residual spectra to zero-phased waveforms 

To use the codebook as the exciting source of an LPC
synthesizer, the spectral centroids should be converted to waveforms.
To compress the waveform data further without degrading the quality
of synthesized speech, we adopt here the zero-phasing technique. The
zero-phased waveforms of the codebook spectra are calculated by
applying an inverse FFT to the spectra with the phase parts set to
zero. Since the zero-phased waveform is symmetrical and its energy
is concentrated at the zero-point, it compresses very effectively and
is more robust to pitch modification (problem (1)) than the original
residual signal. Moreover, the quality of the
synthetic speech turns out to be fairly good and stable, mainly
because the resultant codebook represents well the whole spectral
space of the voiced residual signals.
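
The zero-phasing step can be illustrated with a short sketch. It uses
a direct DFT instead of an FFT so that it needs only the standard
library, and the function names are hypothetical. The input of
`zero_phase_waveform` is assumed to be the magnitude spectrum of a
real signal, which is symmetric, so the inverse transform with all
phases set to zero reduces to a sum of cosines.

```python
import cmath
import math
from typing import List


def dft_magnitude(x: List[float]) -> List[float]:
    """Magnitude of the discrete Fourier transform of a real sequence."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]


def zero_phase_waveform(magnitude: List[float]) -> List[float]:
    """Inverse DFT of a magnitude spectrum with every phase set to zero.

    For the symmetric magnitude spectrum of a real signal the sine terms
    cancel, so only the cosine terms are summed.
    """
    n = len(magnitude)
    return [sum(magnitude[k] * math.cos(2 * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]
```

The result is symmetric about sample 0 and has its largest value
there, which matches the description of the energy being concentrated
at the zero-point, while the magnitude spectrum is unchanged.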

(4) Synthesis using the codebook 

A residual code number is attached to each voiced frame. The
zero-phased waveform centroid which corresponds to the code number is
read out from the codebook and used as the exciting source of the
synthesizer.

EFFECT OF THE PROPOSED METHOD: The proposed method has the
following good effects, which have already been confirmed by
experiments using our PC-based text-to-speech system. This method
is especially effective in the practical implementation of a
high-quality text-to-speech system.
(1) Stable quality 

Naturally, the speech quality obtained by this method is much
better than that of a pulse/noise-excited synthesizer. It is also as
good as that of a synthesizer which uses the original (uncompressed)
residual signals. In general, a synthesizer using the original
residual is said to have a certain roughness in its speech quality,
because it is difficult to absorb the local fluctuation of the
residual signals. Zero-phasing, on the other hand, makes the overall
quality homogeneous and stable because of its robustness to pitch
modification. Moreover, since the codebook represents well the whole
spectral space of the voiced residual signals, this homogeneous
quality is not far from that of a synthesizer using the original
residual signals. These are the reasons why the speech quality of
the proposed synthesizer is as good as that of a synthesizer using
the original residual signals.
(2) High data compression rate 

Besides stable quality, this method also yields a high data
compression rate. For instance, the compression rate of our
experimental system is about 1/20.

The original residual data size:

   4738 (frames) x 80 (points) x 1 (byte) = 379.04 (KB)
      4738 (frames)   : number of voiced frames
      80 (points)     : average number of points per 1-pitch
                        residual signal
      1 (byte)        : data size per point

The data size compressed by this method:

   256 (centroids) x 64 (points) x 1 (byte) = 16.384 (KB)
      256 (centroids) : number of centroids
      64 (points)     : number of points per 1/4-pitch
                        zero-phased residual signal
      1 (byte)        : data size per point

The compression rate:

   (16.384 / 379.04) x 100 = 4.3 (%)  =>  less than 1/20
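
The arithmetic above can be checked with a few lines of code. The
figures are taken from the text; the variable names are illustrative,
and 1 KB is taken as 1000 bytes, which is what the quoted totals imply.

```python
# Figures quoted in the disclosure.
VOICED_FRAMES = 4738       # number of voiced frames
POINTS_PER_PITCH = 80      # average points per 1-pitch residual signal
CENTROIDS = 256            # codebook size
POINTS_PER_CENTROID = 64   # points stored per zero-phased waveform
BYTES_PER_POINT = 1

original_bytes = VOICED_FRAMES * POINTS_PER_PITCH * BYTES_PER_POINT
compressed_bytes = CENTROIDS * POINTS_PER_CENTROID * BYTES_PER_POINT
rate_percent = 100.0 * compressed_bytes / original_bytes

print(f"original:   {original_bytes / 1000:.2f} KB")
print(f"compressed: {compressed_bytes / 1000:.3f} KB")
print(f"rate:       {rate_percent:.1f} %")
```

Note that this counts only the waveform storage; the per-frame code
numbers that index the codebook are not included in the quoted figures.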

4/12 - (C) IBM CORP 1993
AN - NN9111206 

TI - Flexible High-Quality Audio Delivery Via Infrared Link. 
PUB - IBM Technical Disclosure Bulletin, November 1991, US 
VOL - 34 
NR - 6 

PG - 206 - 208 

TXT - - Disclosed are flexible, modular approaches to providing
high-quality stereo audio to the personal computer or workstation
user. Included is a description of a novel PC-speaker-in-a-chair
implementation.

As multimedia systems become more prevalent, full-range audio
signals will replace simple "beeps" as audio output from personal
computer systems. The traditional audio speaker location within the
computer system unit will quickly be found inappropriate for delivery
of this type of audio. Adding larger, higher-quality speakers to the
computer system is one approach; however, the volume required to
ensure that adequate sound reaches the user will be a nuisance
in the home or office environment. Alternate
methods for delivering high-quality audio to the computer user are
therefore needed.

Depending on the environment in which the computer system is 
used, various approaches to delivering this sound may be appropriate. 
For example, high-fidelity amplified speakers would be appropriate 
where the computer is being used to present information to several 
people in a meeting room, while use of such speakers in a typical 
office environment would be intrusive, and headphones might be more 
appropriate. Because the use environment for a particular system 
cannot be predicted in advance, a flexible approach to delivering 
audio to the user is required. 

The approach disclosed here allows the use of a multitude of
delivery systems, including those mentioned above, all of which
receive audio information modulated onto an infrared (IR) signal.
Versions of this approach may be built into a computer system or
added to an existing system.

The basic approach used in all cases disclosed here is one in 
which the audio signal from the host computer is modulated onto an IR 
carrier. The IR sending module is located in a convenient place, 
such as atop the keyboard or CRT. A variety of devices receive the 
IR signal and demodulate it back to audio. 

Two implementations of the sender unit are disclosed. In the
first case, it is assumed that the host system has been designed with
existing audio-out jacks. Here a standalone modulator unit is used.
A plug may be inserted into the audio source jack of the host
computer. A cable brings the audio information into the unit body.
Within the body is modulating circuitry and an IR output diode. A
lens diffuses the IR output omni-directionally. The body may be
mounted on top of a CRT, keyboard, or system unit, as appropriate for
the particular system set-up. Adapter plugs are provided to allow
simple adaptation to many jack styles, including mini-phone, phone,
and RCA cables. The circuitry to implement such a sending unit is
commercially available. The circuitry may be powered by a battery or
AC adapter, or provisions may be made to allow a DC voltage to be
provided by the host computer and transmitted via the connecting
cable.

The function and circuitry of the second embodiment of the
sender unit are identical; the difference is that the entire
system is integrated into the computer system, rather than being an
add-on feature. This approach has certain advantages, as the cables
can be integrated into existing keyboard or CRT cables, and the diode
lens can be integrated into the hardware in an attractive way.

Different receiving unit types are proposed. In each case, the
receiving unit contains an IR detector and demodulation circuit.
Again, this circuitry is commercially available. Each type of
receiving unit is described briefly in the following. In one
embodiment, amplified speakers are used which snap onto the right and
left side of the keyboard. The receiving circuitry drives a
low-power amplifier contained within small speaker units. The
speaker units are designed to match the cosmetics of the keyboard and
other personal computer hardware, and contain full-range speakers up
to 5 inches in diameter. Speakers this size can typically reproduce
audio in the range of 100 Hz - 12 kHz. This range covers the vast
majority of human hearing. Since the user is, by definition, within
arm's length of the keyboard and speakers, low audio levels can be
used to provide sufficient volume to the user, while interference and
distraction to others is minimized. A peg-and-hole arrangement is
one of many physical design techniques which could be employed to
allow the speakers to be physically locked to the keyboard.
Alternatively, the speakers may be placed in any convenient location
in the vicinity of the keyboard. A related embodiment uses speakers
separated by long wires from the keyboard.

In another embodiment, which involves a PA/audio system input, a
unit is proposed which simply translates the received IR signal back
into an audio line-level signal and outputs it to RCA plugs, which
may be used to connect to an existing PA or Hi-Fi audio system. This
approach may be most appropriate for home use, as it minimizes the
investment by using equipment already in place. Additionally, this is
a useful approach for auditoriums or when the host computer is being
used to drive an audio/visual presentation to a large group.

In a chair/integrated speakers embodiment, an ergonomically
designed high-back chair contains small high-quality audio speakers
mounted inside the chair back, one just behind each of the user's
ears. Again, the audio signal is transmitted via infrared link from
the host PC to the chair. The chair body contains circuitry which
receives and amplifies the signal, and presents it to the user via
the speakers. A volume control may be located in any convenient
spot, perhaps hidden in the armrest. The top sections of the chair
back adjust to fine-tune the location of the speakers to fit the
individual user. The chair is a novel approach to delivering
high-fidelity sound to the computer user. Because of the close
proximity of the speakers to the user, very low volume levels are
sufficient, and so there is little possibility of distraction to
those nearby. At the same time, the disadvantages of headphones
(fatigue, lack of comfort, inability to use the phone, easily lost,
can shut out external sounds, sanitation) are avoided. Note that the
headphone-like quality of the speaker chair allows use of binaural
sound sources and other psychoacoustical effects for
three-dimensional or surround-sound effects without other special
equipment. This type of chair would be especially useful in
productivity centers, or multimedia learning labs where many users
may be working in close proximity. In related embodiments, the
speakers may be snapped on to existing chairs in a user's office, and
a circuit can be used which turns off the audio when no one is
sitting in the chair.

Standard headphones with IR receiving circuitry could also be 
used to receive high-quality audio from PCs. 

Note that the delivery of remote audio may have
particular value in areas where the computer system unit (and
integrated speaker) are inaccessible or behind a wall, for example,
in museum displays, shopping malls, and public information centers.
In addition, the remote wireless delivery of audio information may be
useful in schools. The ideas disclosed need not replace the
traditional speaker; in fact, they can be used in conjunction with a
system speaker in the following way. Audio intended for a single user
can still be broadcast over the local computer system-unit speaker,
while general information or emergency information can be transmitted
from the computer to a remote audio delivery system, as described.
This invention also applies to the remote wireless delivery of other
information relating to audio, such as MIDI signals for musical
instruments, and speech synthesizer control parameters.

5/12 - (C) IBM CORP 1993 
AN - NB8911390 

TI - Pause Duration Control for Japanese Text-To-Speech System 
PUB - IBM Technical Disclosure Bulletin, November 1989, US 
VOL - 32 

TXT - - This article describes a method for controlling pause
duration in spoken sentences synthesized by a text-to-speech system.
This method is based on analysis of spoken sentences and can produce
natural pauses.

In Japanese, pause duration is very important for communicating
the syntactic and semantic structure of the sentence. Consequently,
pause duration control is a key to synthesizing natural-sounding
spoken sentences.

Conventional methods: There are two typical methods for
controlling pause duration. The first method is based solely on
punctuation marks; pause duration P is given by:

        P0   (no punctuation mark)
   P =
        P1   (punctuation mark),

where P0 and P1 are constant values of pause duration.
***** SEE ORIGINAL DOCUMENT *****

The other method controls pause duration solely by using the
breath-group length before the pause (L1); pause duration P is
given by:

   P = a' + b' * L1,

where a' and b' are parameters given by regression analysis.

However, pause durations generated by using these methods bear 
little relation to those of natural utterance data. Pause durations 
assigned by these rules do not contribute to the naturalness of the 
synthesized speech. 

New method: In this new control method, pause duration is
calculated from the lengths of the breath-groups both before and after
the pause (L1, L2); pause duration P is given by:

   P = a" + b" * (L1 + c" * L2),

where a", b" and c" are parameters given by regression analysis.

Fig. 2 shows the relation between pause duration and the length
of the breath-groups both before and after the pause. The pause
duration data are scattered around a straight line, and the
correlation coefficient (R) of the regression is very high.

This new method of controlling pause duration contributes to
the naturalness and intelligibility of the synthesized speech.
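
The three pause-duration rules described in this article can be
written side by side as small functions. This is only a sketch: the
function names are hypothetical, the parameters a, b, and c stand for
the regression parameters written with primes in the text, and the
numeric values in the usage note are placeholders, since the article
gives neither parameter values nor units.

```python
def pause_by_punctuation(has_punctuation: bool, p0: float, p1: float) -> float:
    """First conventional rule: a constant chosen by punctuation alone."""
    return p1 if has_punctuation else p0


def pause_by_preceding_length(l1: float, a: float, b: float) -> float:
    """Second conventional rule: P = a' + b' * L1."""
    return a + b * l1


def pause_by_both_lengths(l1: float, l2: float,
                          a: float, b: float, c: float) -> float:
    """New rule: P = a" + b" * (L1 + c" * L2)."""
    return a + b * (l1 + c * l2)
```

With placeholder values a = 100, b = 20, and c = 0.5, breath-group
lengths L1 = 10 and L2 = 5 would give P = 100 + 20 * (10 + 2.5) = 350
in whatever units the regression was fitted in.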

6/12 - (C) IBM CORP 1993 
AN - NN8710208 

TI - Mechanism for Integrating VOICE and DATA on a Transmission Channel 
PUB - IBM Technical Disclosure Bulletin, October 1987, US 
VOL - 30 
NR - 5 

PG - 208 - 209 

TXT - - One way to integrate voice and data information on the same
transmission channel is to reduce the rate needed for voice
transport by means of voice-compression techniques. These techniques
are sophisticated, and voice-compression equipment is needed at both
ends of the channel. This article relates to a simple mechanism
allowing voice and data to be transmitted on the same channel without
using voice-compression techniques.

Even if no compression technique is used, the voice signal can be
carried together with data information, because the voice signal
includes no-activity periods. Such periods correspond to a voice
level lower than a predetermined threshold. They are long compared to
a slot duration, which is generally 125 microseconds. A voice activity
detector VAD detects inactivity by integrating the voice signal over
several slots.

In this environment it is difficult to delimit the voice and data
information, since non-compressed voice slots must not be permanently
altered and it is not possible to use one bit out of the n bits in
the slot to indicate whether the slot carries voice or data. The main
idea is to use an HDLC (High Level Data Link Control) flag F as a
delimiter, with a slot handler ensuring that the voice slots never
simulate a flag. Note that zero-insertion techniques cannot be used
inside the voice stream, as it is necessary to keep all voice slots
at slot boundaries. The slot handler detects voice slot values
corresponding to flags F and alters them to avoid flag simulations by
changing the flag pattern 01111110 into 01111111; the low-order bit
is changed from 0 to 1. This does not cause any significant
degradation of the voice quality. Once the flag simulations have been
eliminated from the voice information, flag F can be used to indicate
the beginning and the end of a no-voice activity period, which is
used to carry data information on the same channel.

   / VOICE / F /       DATA        / F / VOICE /
                 No voice activity

The zero-insertion/deletion technique applies to the data stream, to
avoid false flag simulations during data periods. If the data stream
corresponds to an HDLC transfer, zero insertion applies to all data,
including the message flags. The voice activity detector VAD detects
the no-voice activity periods, during which data can be transmitted
and handled in a conventional way by means of a zero-insertion
circuit and a flag generator. The merged voice and data stream is
transmitted on channel CH. At the end of a no-voice activity period,
a flag is generated at a voice slot boundary to indicate that the
next bytes are voice bytes. This means that the data portion is a
multiple of the voice slot duration, but corresponds to any number of
data bits due to the zero insertion. If a zero is to be inserted when
a flag must be generated, the first bit of the flag is considered as
the inserted zero. Consequently, the voice slots do not suffer any
delay distortion; they are delivered as if they had the whole channel
to themselves. This allows the receiving end to use a normal decoding
circuit. Data portions correspond to voice idle periods, which are
distinguished at the receiving end as the periods delimited by flags,
during which the receiver generates a permanent idle signal.
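
The two bit-level operations this article relies on can be sketched as
follows. The sketch assumes 8-bit voice slots and represents the data
stream as a string of "0" and "1" characters for readability; the
function names are hypothetical. `protect_voice_slot` covers only the
case named in the text, a slot whose value equals the flag, and
`zero_insert` / `zero_delete` are the usual HDLC rule of adding a 0
after five consecutive 1s and removing it again at the receiver.

```python
FLAG = 0x7E  # HDLC flag pattern 01111110


def protect_voice_slot(slot: int) -> int:
    """Alter a voice slot that would simulate a flag: 01111110 -> 01111111."""
    return 0x7F if slot == FLAG else slot


def zero_insert(bits: str) -> str:
    """Insert a 0 after every run of five consecutive 1s (data stream only)."""
    out = []
    ones = 0
    for bit in bits:
        out.append(bit)
        if bit == "1":
            ones += 1
            if ones == 5:
                out.append("0")   # stuffed zero
                ones = 0
        else:
            ones = 0
    return "".join(out)


def zero_delete(bits: str) -> str:
    """Remove the stuffed zeros again; assumes a correctly stuffed input."""
    out = []
    ones = 0
    drop_next = False
    for bit in bits:
        if drop_next:             # this is the stuffed zero: discard it
            drop_next = False
            ones = 0
            continue
        out.append(bit)
        if bit == "1":
            ones += 1
            if ones == 5:
                drop_next = True
        else:
            ones = 0
    return "".join(out)
```

After zero insertion the data stream never contains six consecutive
1s, so the flag pattern can appear in it only where the flag generator
deliberately places one.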

7/12 - (C) IBM CORP 1993 
AN - NN86123055 

TI - Constructing Method for Speech Synthesis Units 

PUB - IBM Technical Disclosure Bulletin, December 1986, US 

VOL - 29 

TXT - - A segmentation and smoothing method is proposed to build
smoothly connectable speech synthesis units from human utterances.

Background Diphone, as a speech synthesis unit *, enables 
smooth 



connection and sophisticated duration control. However, it is 
difficult to build a diphone which works in various phonetic 
environments. Some phonemes are strongly co-articulated or need 
allophones to keep intelligibility and naturalness. Also, in 
commbining synthesis units to synthesize a word, sentence or 
text, 

smoothing is required to avoid a perceptual discontinuity between 
connected frames caused by changeable vocal effort. VCV (Vowel 
Consonant Vowel) - based diphone In this proposal, diphones are 
adapted to include co-articulations or allophonic features by 
additional entries for specific phonetic ***** SEE ORIGINAL DOCUMENT 
***** environment. In a mora-based phonetic system such as Japanese, 
these problems are solved by extracting parameters from VCV segments 
without losing freedom of duration control. In a VCV-based diphone 
set, only a pair of (VIC) and (CV2) diphones from the same V1CV2 
segment can be connected with each other at the consonant portion. 
Note that, for example, of 5 Japanese vowels /a,e,i,o,u/ and 
consonant /r/, 5 different kinds of (ar) diphone must be prepared for 
each succeeding vowel, and 5 kinds of (ra) diphone must be prepared 
for each preceding vowel. Fig. 1 shows the example of proposed 
segmentation. In Fig. 1, points a, b, and c are determined by 
spectral features and signal power. When a consonant is continuant, 
redundant frames around point b are omitted. Normalization and 
smoothing of parameters To eliminate perceptual discontinuity, 
synthesis parameters, such as amplitude and formant frequencies, 
should be identical to those of neighboring diphones at the 
connecting point. Proposed here is a simple method to smooth 
synthesis parameters. Series of raw parameter values extracted from 
human speech are: 1) normalized to a unique value which is
determined previously at the vowel end-frame, and 2) smoothed according
to linear interpolation at the other frames. Smoothing is performed
within VCV, and then it is
split into two diphones to prevent modifying transition 
unnecessarily. Fig. 2 shows the process of smoothing. In Fig. 2, 
one of formant frequencies is smoothed, which is identical to "normal
value" fv1 and fv2 at both end-frames, respectively, and can be
connected to the preceding (-V1) diphone and the succeeding (V2-)
diphone without discontinuity. Reference N. R. Dixon and H. D. Maxey,
"Terminal Analog Synthesis of Continuous Speech Using the Diphone Method
of Segment Assembly," Trans. IEEE, AV-16, 40-50 (March, 1968).
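
The normalization and smoothing step in the record above can be illustrated with a short Python sketch. It is one possible reading of the description, not the disclosed implementation: the function name `smooth_track`, the list-of-frames representation, and the choice to spread the end-frame correction linearly over the interior frames are all assumptions made for illustration.

```python
def smooth_track(raw, start_target, end_target):
    """Shift a parameter track so both end-frames hit fixed "normal" values.

    raw: one synthesis parameter (e.g. a formant frequency) per frame,
         extracted from a V1CV2 segment (hypothetical representation).
    start_target, end_target: the predetermined values at the V1 and V2
         end-frames.
    The correction needed at each end is interpolated linearly across
    the interior frames, so the shape of the transition is kept.
    """
    n = len(raw)
    if n < 2:
        raise ValueError("need at least two frames")
    start_offset = start_target - raw[0]
    end_offset = end_target - raw[-1]
    smoothed = []
    for k, value in enumerate(raw):
        w = k / (n - 1)  # 0.0 at the first frame, 1.0 at the last
        smoothed.append(value + (1.0 - w) * start_offset + w * end_offset)
    return smoothed
```

Because every diphone that ends or starts in the same vowel is forced to the same end-frame value, any two of them can be joined at that frame without a step in the parameter.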

8/12 - (C) IBM CORP 1993 
AN - NN86055462 

TI - Generation of Nasalized Vowels in Text-To-Speech Synthesis 
PUB - IBM Technical Disclosure Bulletin, May 1986, US 
TXT - - The present method involves synthesizing the nasalization of
vowels between consonants in a speech synthesis environment.

Briefly, (a) primary speech units --such as diphones-- which are 
concatenated to form words are scanned for the presence of a nasal 
consonant, (b) a look-ahead is performed to detect the presence of a 
second nasal consonant, and (c) if a second nasal consonant is 
detected, a nasal branch of the synthesizer is turned on for the
duration of the intervening vowel. In describing the method in
further detail, it is observed that for most phoneme or diphone
formant synthesizers, there are 10 to 40 control parameters guiding
the synthesizer in producing a speech waveform. These parameters
change through time; the entire time ensemble for each
parameter-class is often referred to as a "channel". One common
parameter is called AN (amplitude of nasality). By way of an example,
let $AN_i$ be the amplitude of nasalization as a function of time for a
synthesized speech utterance. $AN = 0$ would imply no nasalization.
First the detection of the presence of steady-state nasalization at
a particular time point, i, must take place to trigger the algorithm:

$AN_i > 0$ and $AN_{i+1} = AN_i$, $i = 1, 2, 3, \ldots$ (1)

If the above condition (Eq. 1) is true for a particular i, then i is
saved, and a search is conducted for a future region of steady-state
nasalization from $i + t_1$ to $i + t_2$ (for example, $t_1$ = 5 ms and
$t_2$ = 30 ms). Too long a future search (large $t_2$) would lead to
unwarranted nasalization. The search may be easily performed by
searching the future AN's for a value equal to the current detected
steady-state value at i:

$AN_j = AN_i$, $j = (i + t_1), (i + t_1) + 1, \ldots, i + t_2$ (2)

$t_1$ is needed to preclude the current nasal consonant from the
search. If Eq. 2 is true for a particular j,
then the intervening vowel sound is nasalized by turning the nasal
synthesizer branch on up to time point j:

$AN_k = AN_i$, $k = i, i + 1, i + 2, \ldots, j$ (3)

Eq. 3 is only implemented if there is no time between the
surrounding nasal consonants during
which voicing is interrupted (i.e., AO ≠ 0, where AO is the amplitude
of voicing). Since nasalized vowels are constructed algorithmically
by this approach, it is not necessary to store diphones containing
these sounds, and the size of the library of stored sounds is not
increased as a result. The above-outlined method is illustrated by
the following example:
Input example text: "man"
Phonemic transcription: MX AE NX
Diphone transcription: MXAE AENX
Result generated from algorithm of method: MXAEn AEnNX (n indicates
nasalization).
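
The three conditions of the nasalization method can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the AN and voicing channels are given as per-frame lists, `t1` and `t2` are expressed in frames rather than milliseconds, and the function name `nasalize` is hypothetical.

```python
def nasalize(an, voicing, t1, t2):
    """Return a copy of the AN channel with vowels between nasals nasalized.

    an, voicing: per-frame amplitude of nasality and of voicing.
    t1, t2: bounds of the look-ahead window, in frames (assumed units).
    """
    if t1 < 1 or t2 < t1:
        raise ValueError("require 1 <= t1 <= t2")
    out = list(an)
    n = len(an)
    i = 0
    while i < n - 1:
        j_found = None
        # Eq. 1: steady-state nasalization detected at frame i.
        if an[i] > 0 and an[i + 1] == an[i]:
            # Eq. 2: search the window for the same steady-state value.
            for j in range(i + t1, min(i + t2, n - 1) + 1):
                if an[j] == an[i]:
                    j_found = j
                    break
        # Eq. 3 applies only if voicing is never interrupted in between.
        if j_found is not None and all(v != 0 for v in voicing[i:j_found + 1]):
            for k in range(i, j_found + 1):
                out[k] = an[i]
            i = j_found
        else:
            i += 1
    return out
```

With `an = [0, 3, 3, 0, 0, 0, 3, 3, 0]`, continuous voicing, `t1 = 3` and `t2 = 6`, the zero-valued vowel frames between the two nasal regions are raised to 3, while a single unvoiced frame in between leaves the channel unchanged.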

9/12 - (C) IBM CORP 1993 
AN - NN86055427 

TI - Generation of "H" Sounds in Text-To-Speech Synthesis
PUB - IBM Technical Disclosure Bulletin, May 1986, US 
VOL - 28 
NR - 12 

PG - 5427 - 5428 

TXT - - The present invention relates to a method for producing
high-quality "H" sounds in a speech synthesizer. Because many speech
synthesis systems construct utterances from a database of stored 
steady-state sounds (phonemes) , or transitions between steady-states 
(diphones) , it is necessary to have a steady-state description for 
each sound. However, the /h/ sound is so influenced by the 
characteristics of its surrounding sounds that it cannot be defined 
and stored as a steady-state phonemic unit on its own. Similarly, in 
the case of diphone synthesis, this chameleon effect makes it 
impossible to define transitions to a generic steady-state "H" . A 
method for producing high-quality "H" sounds using diphones as 
primary units is now described, and the same underlying principle 
could be applied to phoneme synthesis as well. In brief, the input 
string of diphones is scanned for the presence of the "H" sound. 
When found, the preceding sound is tapered to silence, a transition
state is constructed from an already existing unit, and the following
sound is started with a gradual onset from silence. The method can be
defined more rigorously and more generally in terms of the following
diphone notation. Each diphone is represented as a pair-transition
p(n):p(n+1), n = 1,3,5,... The string of diphones making up an
utterance is scanned until p(n+1) = "HX", at which point new diphones
are inserted. By way of example, suppose there are two
pair-transitions characterized as: p(1):p(2) = EEHX and p(3):p(4) = HXEH.
Given the detected diphones containing the H sounds, ***** SEE
ORIGINAL DOCUMENT ***** where p(2) = "HX", the following
transformation from two diphones to three diphones is applied: *****
SEE ORIGINAL DOCUMENT ***** XX indicates silence, and p(n):XX
therefore indicates gradual tapering of the p(n) sound to silence.
"Ah", "asp", and "AO" are typical control-data parameters (and
notation) for speech synthesizers. "Ah" is the amplitude of the hiss
source (random number driving function). "Asp" is a bit that
indicates aspiration (the noise source directed through the formant
chain). "AO" is the amplitude of voicing. In other words, a
transition p(1):p(4) is constructed using a pre-existing diphone with
subsequent modification (application of a low level of aspiration
during the smooth transition to obtain a natural "H" sound). During
p(1):p(4) any voicing or nasalization in the original diphone is
turned off (A0=0, An=0). Since all "H"-sounds are constructed
algorithmically by this approach, it is not necessary to store 
diphones containing these sounds, and the size of the library of 
stored sounds is therefore decreased. 

10/12 - (C) IBM CORP 1993 
AN - NN85081248 

TI - Use of the Grid Search Technique for Improving Synthetic Speech
Control-Data

PUB - IBM Technical Disclosure Bulletin, August 1985, US 
VOL - 28 
NR - 3 

PG - 1248 - 1249 

TXT - - Many speech synthesizers utilize a library of stored
control-data parameters to direct the actual software synthesizer in
producing the output speech waveform. The number of such parameters
varies with the type of synthesizer, but usually is within the range
of 10 to 40 parameters. The method described here would be useful in
optimizing the values of such parameters so that the synthetic speech 
power spectrum (amplitude vs. frequency) most nearly conforms to 
natural speech power spectra. Traditionally, the grid search 
technique is used to fit curves with simple mathematical expressions 
(such as gaussian, trigonometric, or polynomial functions) to 
experimental data. Here, the technique is applied to a 
mathematically complicated function, the synthetic speech power 
spectrum, which cannot be described by a simple algebraic expression. 
The Method Let a measure of goodness of fit $\chi^2$ between the
synthesizer power spectrum $S_i$ and natural target spectrum $N_i$ be
defined as: ***** SEE ORIGINAL DOCUMENT ***** where s, the
uncertainties in the natural spectral points, may be set to 1 for
this discussion. The synthesizer spectrum is a function of the
control-data parameters $c_j$. $\chi^2$ may be considered a continuous
function of the parameters $c_j$ describing a hypersurface in
n-dimensional space. The space must be searched for the appropriate
minimum value of $\chi^2$. The optimum values for $c_j$ can be estimated by
minimizing $\chi^2$ with respect to $c_j$.
Step 1) Initial values for $c_j$ are given by the current control-data
parameters.
Step 2) One parameter $c_j$ is incremented by a quantity Wc
(user-selected), where the program chooses the sign such that $\chi^2$
decreases.
Step 3) The parameter $c_j$ is repeatedly incremented by Wc until
$\chi^2$ starts to increase, and the minimum value is determined by
parabolic interpolation.
Step 4) $\chi^2$ is minimized for each parameter.
Step 5) The above procedure is repeated until the last iteration
yields a negligibly small decrease in $\chi^2$.
Applications The current
synthesizer control-data $c_j$ serves as input to the grid search. The
final values of c returned by the algorithm direct the synthesizer to
produce a power spectrum most nearly like the human speech spectrum,
and these new c values may be stored in the library in place of the
old values. Since the mathematical similarity between natural and
synthetic speech curves may not necessarily correspond to perceptual
similarity, sets of parameters may be saved near the minimum $\chi^2$ for
subsequent perceptual testing. The 'best' parameters may then 
replace the old values within the library. The method outlined is 
itself computationally fast and has a minimum number of assumptions 
as prerequisite for its use. The power spectra may be smoothed prior 
to the grid search in order to eliminate pitch as a variable in the 
calculation. This technique can provide an aid to achieving the goal 
of almost all speech synthesis: the production of a natural and 
intelligible speech output. 
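
The five steps of the grid search can be sketched in Python as shown below. This is an illustration under assumptions, not the disclosed program: the goodness-of-fit measure is passed in as an arbitrary function `fit` because its formula is not reproduced in this record, a single step size is used for every parameter, and the names `line_minimum` and `grid_search` are hypothetical.

```python
def line_minimum(f, x0, step, max_steps=1000):
    """Steps 2-3: walk along one axis until f rises, then fit a parabola."""
    f0 = f(x0)
    f_plus, f_minus = f(x0 + step), f(x0 - step)
    if f_plus >= f0 and f_minus >= f0:
        # The starting point already brackets the minimum.
        xa, fa, xb, fb, xc, fc = x0 - step, f_minus, x0, f0, x0 + step, f_plus
    else:
        d = step if f_plus < f_minus else -step  # sign that decreases f
        xa, fa = x0, f0
        xb, fb = x0 + d, min(f_plus, f_minus)
        xc, fc = xb + d, f(xb + d)
        steps = 0
        while fc < fb and steps < max_steps:
            xa, fa, xb, fb = xb, fb, xc, fc
            xc, fc = xb + d, f(xb + d)
            steps += 1
    # Vertex of the parabola through three equally spaced points.
    denom = fa - 2.0 * fb + fc
    if denom <= 0.0:
        return xb
    return xb + 0.5 * (xc - xb) * (fa - fc) / denom


def grid_search(fit, params, step, tol=1e-9, max_iter=100):
    """Steps 1, 4, 5: minimize each parameter in turn until gains vanish."""
    params = list(params)  # Step 1: start from the current values
    best = fit(params)
    for _ in range(max_iter):
        for j in range(len(params)):
            def along_axis(x, j=j):
                trial = list(params)
                trial[j] = x
                return fit(trial)
            params[j] = line_minimum(along_axis, params[j], step)
        current = fit(params)
        improvement = best - current
        best = current
        if improvement < tol:
            break
    return params, best
```

For a fit function that is quadratic and separable in its parameters, the parabolic interpolation lands on the exact minimum of each axis in one pass; for a real synthesizer spectrum several passes would normally be needed.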

11/12 - (C) IBM CORP 1993
AN - NN83113071

TI - General-Usage Remote-Access Storage and Forward Message Handling
PUB - IBM Technical Disclosure Bulletin, November 1983, US
VOL - 26
NR - 6
PG - 3071

TXT - - The technique discussed herein enhances the ability of a
telephone desk set to offer automatic call answering and store and 
forward capability. The logic processing discussed can be applied to 
any telephone or private branch exchange (PBX) system. The 
General-Usage Remote-Access Storage and Forward Message Handling
allows a caller to leave information at either a busy or unattended
telephone. The telephony system being discussed allows for
acquisition of message information without recourse to voice
digitalization/storage. The telephony management system uses speech
synthesis to advise a caller that the phone is either unattended or
busy. A canned, synthesized message is used for these purposes. The
caller is advised that by using his push-button key pad he can leave
his telephone number by simply rekeying it in. The caller is then
prompted to leave his name by the following push-button sequence for
each character in the caller's name:
1. The push-button key containing a respective character of the
caller's name is touched.
2. Immediately afterwards, the number 1, 2, or 3 is touched to
indicate which character on the previously stroked key was the
intended entry.
In this manner, a person's name can be spelled. Q and Z are entered
as if they were inscribed on the "1" push button. A priority
can also be entered by striking the appropriate push buttons as
prompted by the speech synthesized instructions. Hence, "1" can
indicate urgent, "2" return this call at your convenience, and "3"
return this call today. The system can then automatically time stamp
the call. All instructions and prompts that the caller hears are
speech synthesized. This allows precise, clear instructions to be
canned when the system is produced yet leaves the telephone owner the
prerogative of adding a personalized introductory or ending message.
This is done by composing the personalized message in
machine-readable form and then having it synthesized and appended to
the canned message. Having the appended message enunciated by the
synthesizer avoids introducing another voice into the message a
caller hears. Messages are retrieved from the phone either via a CRT or
by a canned voice on the phone spelling the caller's name, giving the
telephone number and priority.
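
The two-keystroke name entry can be sketched in Python as follows. The letter layout is an assumption based on a conventional push-button dial of the period (no Q or Z on keys 2 to 9), and the record does not say which position Q and Z take on the "1" key, so Q = 1 and Z = 2 is a guess made for illustration. The names `KEYPAD` and `decode_name` are hypothetical.

```python
# Assumed letter layout; Q and Z are treated as if inscribed on "1".
KEYPAD = {
    "1": "QZ", "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PRS", "8": "TUV", "9": "WXY",
}


def decode_name(keys):
    """Decode a string of key presses, two per letter, into a name.

    The first press of each pair selects the key that carries the
    letter; the second (1, 2 or 3) selects the letter's position on it.
    """
    if len(keys) % 2 != 0:
        raise ValueError("each letter needs exactly two key presses")
    letters = []
    for key, position in zip(keys[0::2], keys[1::2]):
        group = KEYPAD.get(key)
        if group is None or position not in ("1", "2", "3"):
            raise ValueError("invalid key pair: " + key + position)
        index = int(position) - 1
        if index >= len(group):
            raise ValueError("no letter at that position: " + key + position)
        letters.append(group[index])
    return "".join(letters)
```

Under this layout the presses `7361438142` spell SMITH: key 7 position 3 is S, key 6 position 1 is M, and so on.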

12/12 - (C) IBM CORP 1993 
TI - Audio Indication of Error in Speech Recognition. December 1980.
PUB - IBM Technical Disclosure Bulletin, December 1980, US
TXT - 2p. A technique is described for providing an audio indication
of the recognition reliability in speech recognition and altering the
speech quality in the speech synthesizer.

In a recently proposed speech compression technique, a speech 
signal is recognized by means of a speech recognizer and thereby 
converted into a string of words or phones (units of vocal sound) . 
The resulting string of words or phones is then transmitted to a 
distant location where the speech signal is resynthesized . The 
problem that arises in such a compression technique is that when the 
speech recognizer makes an error, an incorrect word or phone is 
synthesized which sounds as good to the listener as the correctly 
recognized words or phones. 

The subject disclosure provides an indication of the speech 
recognizer's reliability as an auxiliary signal which is transmitted 
in addition to the word or phone string. A speech recognizer 1 uses 
a reliability estimator 2 to estimate its own reliability from the 
likelihood profile for the word or phone in question or from some 
other suitable measure. The reliability indicator is used by a 
speech synthesizer 3 to alter the quality of the resynthesized 
speech. Words or phones with high reliability are resynthesized with 
little alteration, while words or phones with low reliability are 
modified during resynthesis. 

One method of modification is to add noise via a generator 4 to
the synthesized speech, or to the control parameters of the
synthesizer. Another method is to transmit, in addition to the
reliability estimate, an alternative word or phone string which is 
resynthesized and mixed with the primary speech, in proportion to the 
reliability estimate. 

In this manner, speech transmitted by 
recognition-transmission-synthesis is provided with an indication of 
its reliability. The reliability is indicated orally, permitting the 
listener to use his own well-developed auditory sense in an attempt
to reconstruct the correct signal. 
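
The two modification methods can be sketched in Python as follows. The record does not specify how the reliability estimate is scaled or how much noise is added, so the 0-to-1 reliability range, the `max_noise` parameter, the Gaussian noise source, and the function names `add_noise` and `mix_alternative` are all assumptions made for illustration.

```python
import random


def add_noise(samples, reliability, max_noise=0.5, rng=None):
    """Add noise whose level grows as the reliability estimate falls.

    reliability: assumed to lie in [0, 1], where 1 means fully reliable.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    rng = rng or random.Random()
    level = (1.0 - reliability) * max_noise
    return [s + rng.gauss(0.0, level) for s in samples]


def mix_alternative(primary, alternative, reliability):
    """Mix the primary and alternative resynthesized signals.

    The primary signal is weighted by the reliability estimate and the
    alternative by the remainder.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return [
        reliability * p + (1.0 - reliability) * a
        for p, a in zip(primary, alternative)
    ]
```

A fully reliable word passes through `add_noise` unchanged, while a word with reliability 0.75 is mixed as three parts primary to one part alternative.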

TI - Multi-modal Data Access

PUB - IBM Technical Disclosure Bulletin, October 1999, UK
NR - 426 
PG - 1393 

TXT - With the proliferation of pervasive devices such as cellular
phones, smart phones, Palm Pilots (WorkPads) and other PDAs, it is
becoming necessary to provide multi-modal access to personal
data such as e-mail, calendar and address book. And, as one would
expect, such solutions are emerging. 

However, most (or all) of these solutions tightly integrate the 
user devices to the back-end. For example, one company might 
provide access to e-mail through voice. Another might offer calendar 
through browsers and PalmPilots. 

In this paper, we describe an open, standards-based approach to
this problem. Rather than specifying which back-ends are accessible, 
we define an open method for adding back-ends. Similarly, we 
describe an easily extensible mechanism for producing clients 
implementing various modalities. 

In designing our solution, we assumed that we were not permitted 
to alter either the data sources or the devices. Thus, we must 
access the sources using whatever protocols they currently export, 
and we must deliver the content to the devices in whatever format(s)
they can render. 

Our resulting infrastructure consists of three layers: interfaces 
to back-end data sources, which we call facades; an input processor, 
which we call a request multiplexer, or ReqMux; and a set of output 
formatters. Below, we describe each of these components. As shown 
in Figure 1, each request flows from a client into the ReqMux, which 
passes the request to a facade, which passes the results to an 
output formatter. A request contains a request type, such as get
today's calendar entries or get message N, and a client-device
indicator, which is used to determine the output format, as described 
further below. 

Facades are the liaisons between our infrastructure and the data 
sources. Each facade has three basic responsibilities: extract data 
from a source; convert the data from the source-dependent format
to our canonical representation; and export the source's commands to
the remainder of our infrastructure. 

The first two points above relate to converting back-end data from
its server-dependent format to the infrastructure's canonical
representation. When pulling data items from back-end sources, we
assume that the sources are fixed -- that is, they will
not be modified to accommodate our infrastructure -- so any
changes in data format required by the infrastructure must be made
by the facade. Since the source is fixed, each facade must use the
protocol exported from the source. For example, we've implemented
POP3 and IMAP4 facades for mail retrieval.

Once the facade has extracted data from a source, it transforms 
the data into a canonical internal format. This representation 
allows the system to normalize differences among sources. We use XML 
for our representations. For example, the mail facade can produce an 
XML document representing an inbox, such as: 
<?xml version="1.0" encoding="US-ASCII" standalone="no"?>
<!DOCTYPE Mail SYSTEM "MML.DTD">
<Mail>
  <MessageSummary id="m1">
    <Date>Fri Jan 29 13:58:36 EST 1999</Date>
    <From>rak@us.ibm.com</From>
    <Subject>Kenny</Subject>
  </MessageSummary>
  <MessageSummary id="m2">
    <Date>Fri Jan 29 13:59:02 EST 1999</Date>
    <From>dlk@us.ibm.com</From>
    <Subject>Cartman</Subject>
  </MessageSummary>
</Mail>

To accommodate the various client devices, this XML representation
is reformatted by the output formatter, as described below. 

Facades also define the commands that are valid for their 
respective sources. When facades register themselves with the ReqMux 
(described below) , they pass a handle to themselves, along with a 
list of supported commands. For example, a mail facade might support 
get inbox, get message N and delete. 

As we describe further below, interpreting certain of these 
commands requires knowledge about the state of source, and this state 
information is stored at the facade. For example, the e-mail command 
get next only has meaning if one retains the index of the last 
message accessed. 

Finally, to improve performance, facades can cache data from their 
sources. Like all caching, this should be transparent both to the 
source and to the remainder of the infrastructure . 

Since facades typically only implement a single protocol, each 
time a new type of source is added, a new facade must be written. 
Fortunately, for many common cases, standard protocols exist, so 
facades can be reused. For example, all IMAP4 mail servers are 
serviced by our IMAP4 facade. Similarly, all common 
news servers can be serviced by an NNTP facade. So, in practice, only
a small number of facades are required.

Further reducing the number of facades, we designed a
facade for a point web source, which is simply the contents of a
single URL. Common examples of a point web source include stock 
quotes and weather forecasts. Adding another such source simply 
requires adding a new URL and corresponding command (e.g., 
<http://www.weather.com, get weather>) to the point source facade. 
This can be accomplished through an HTTP request sent to the point 
source facade . 

The ReqMux receives client requests, and passes them to the 
facades for processing. Its primary function is deciding which 
facade should handle an incoming request. The ReqMux maintains a list 
of <command, facade> pairs. When it receives an incoming request, 
the ReqMux uses this list to determine which source will 
handle the request . 

In some cases, there is a single source for a request, and the 
choice is straightforward. For example, if the request is get inbox, 
then that will be routed to the mail source. 

However, some requests can be handled by multiple sources, with
the proper choice determined by the system state. The ReqMux
maintains enough state data to handle such cases. For example, the
get next request might be valid for either a mail source (get the next
message) or a calendar source (get the next meeting
on the calendar). In this case, the request is routed to the
source that handled the previous request. If no request preceded
this one, or if the source for the previous request does not support
the ambiguous request, then an error is triggered.
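
The routing rule described above can be sketched in Python as follows. The prototype itself is built from Java servlets, so this is only an illustration of the logic: the class layout, the method names `register` and `route`, the use of plain strings to identify facades, and the choice of `LookupError` for the error case are all assumptions.

```python
class ReqMux:
    """Route commands to facades, using the last facade to break ties."""

    def __init__(self):
        self._table = {}   # command -> list of facade names
        self._last = None  # facade that handled the previous request

    def register(self, facade, commands):
        """Record that a facade supports each of the given commands."""
        for command in commands:
            supporters = self._table.setdefault(command, [])
            if facade not in supporters:
                supporters.append(facade)

    def route(self, command):
        """Return the facade that should handle the command."""
        candidates = self._table.get(command, [])
        if not candidates:
            raise LookupError("no facade supports: " + command)
        if len(candidates) == 1:
            chosen = candidates[0]
        elif self._last in candidates:
            chosen = self._last  # ambiguous: reuse the previous source
        else:
            raise LookupError("ambiguous request: " + command)
        self._last = chosen
        return chosen
```

After registering a mail facade for `get inbox` and `get next` and a calendar facade for `get today` and `get next`, a `get next` that follows `get inbox` goes to mail, while a `get next` issued as the very first request raises an error.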

Once the data are retrieved from a source and converted to 
canonical form, they are ready to be formatted for the client 
device. Recall that a client-device indicator flowed in the initial 
request into the infrastructure, and it is preserved as data flow 
from component to component. That indicator is used to select an 
output formatter appropriate to the device. 

Different devices can require different modalities (e.g., speech 
vs. HTML) or they might impose different constraints within the same 
modality (e.g., a PC browser vs. a PDA browser). Each supported 
variant requires an appropriate formatter. 

We considered three ways to implement the formatter: Application 
Program; XSL based Style Sheet script; and JSP based script. 

These choices vary in how they describe the transformations 
needed. The transformation implementor will likely choose among the 
technologies based on personal preference and expertise. An 
application programmer will likely implement a new program to do the 
formatting; an experienced web publisher will likely choose the JSP
based approach; and SGML authors will likely favor the XSL approach.

In our prototype system, we include two output formatters, one for 
speech, and one for browsers. The speech formatter transforms the
canonical format into JSML; the browser formatter transforms it into 
HTML suitable for both PCs and PDAs. We implemented both of those 
formatters using the Application Program technique, as well as the 
XSL style sheet technique. 

We considered two ways to exploit the application program 
technique. The first manipulates the in-memory DOM tree; the second 
leaves the DOM tree intact, but changes the way it is printed. 
(Recall that a DOM tree is an in-memory representation of an XML
document.) Both variants begin by using standard DOM
APIs to read the XML document, and produce a corresponding DOM tree.
We use IBM's XML4J parser to create the DOM tree. 

When manipulating the DOM tree in memory, the goal is to find the
nodes of the tree that contain the tag to be replaced, and to change
the text in those nodes into the new tag text. For example, our mail
messages contain a <FROM> tag, but that tag has no meaning in JSML.
We choose to translate <FROM> into <SENT>,
which is the JSML sentence tag, and causes the voice synthesizer
to read the entry as a sentence. (Note that both the <FROM> and </FROM> tags
are represented by the same node of the DOM tree. Consequently,
changes to the "FROM" text affect both delimiters.)

In XML parlance, tags are represented in the DOM tree by nodes of 
type Element. Thus, the algorithm is to examine the entire DOM tree 
searching for Elements, and when an Element is found, to compare the 
Element's text to the target text. In our example, we're looking for
FROM. When we find a match, we use an XML4J
method to change the name to the new text; in our example, this
changes FROM to SENT. (Note that this method is not part of the DOM
standard; it was added by the XML4J developers.) When this completes,
all FROMs are SENTs.

There is one further complication: we don't simply want to speak 
the text delimited by the <FROM> tag; we also want to speak the word 
"from." This requires that we insert a text node into the DOM tree as 
the first child of the SENT element. This text node contains the 
word "from." This causes the synthesizer to speak
"from" before speaking the text in the <FROM> field. We insert this
node using another XML4J method. 
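
The in-memory rewrite can be illustrated with a short Python sketch. It deliberately swaps in Python's standard `xml.etree.ElementTree` module for XML4J and the DOM, so it shows the idea rather than the prototype's code: the function name `mail_to_jsml` and its parameters are hypothetical, the tag is spelled `From` as in the sample inbox document, and the spoken word is prepended to the element text instead of being inserted as a separate text node.

```python
import xml.etree.ElementTree as ET


def mail_to_jsml(xml_text, old_tag="From", new_tag="SENT", spoken="from "):
    """Rename every old_tag element and prefix its text with a spoken word."""
    root = ET.fromstring(xml_text)
    # Collect matches first so renaming cannot disturb the iteration.
    for element in list(root.iter(old_tag)):
        element.tag = new_tag  # changes both the opening and closing tag
        element.text = spoken + (element.text or "")
    return ET.tostring(root, encoding="unicode")
```

As in the DOM case, a single rename affects both delimiters because the opening and closing tags are produced from the same node.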

When manipulating the string representation of a DOM tree, instead 
of changing the internal representation of the tree, we change the 
way the DOM tree is rendered as a string. To convert the tree to a 
string that embodies the formatted XML document (either JSML or 
HTML) , we execute code to traverse 
the tree, rendering the DOM tree node-by-node. 

To convert the DOM tree to a string, we must visit each node in 
the tree. Conveniently, XML4J comes with several classes that 
automatically visit each node in a DOM tree. They vary in the order 
in which the nodes are visited. We use the 
NonRecursivePreorderTraversal class . 

This class takes as a parameter a class that implements the 
Visitor interface, where Visitor refers to the design pattern of that 
name. (1) The Visitor interface is used to perform operations on each 
node of a DOM tree, and the operation performed depends on the node's 
type. 

The Visitor interface requires that methods be defined for each
DOM-tree node type. However, since we are only altering tags (in our
example, "FROM" tags), which are Element nodes in the DOM tree, all
other types of node are left unchanged. Thus, in our subclass that
implements the Visitor interface, for all other
types, the methods do nothing. In our Element-handling method, we
compare the Element's text to the target text. If the text does not
match the target, the text is printed; if it does match, the
replacement text is printed.
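
The rendering approach can be sketched in Python with the standard `xml.dom.minidom` module standing in for XML4J. This is an illustration under assumptions: the class `RenamingVisitor`, its method names, and the recursive `render` function are hypothetical (the prototype uses XML4J's NonRecursivePreorderTraversal class), and only element and text nodes are handled.

```python
from xml.dom import Node, minidom
from xml.sax.saxutils import escape, quoteattr


class RenamingVisitor:
    """Print each node, substituting a new name for one target tag."""

    def __init__(self, target, replacement):
        self.target = target
        self.replacement = replacement

    def _name(self, element):
        if element.tagName == self.target:
            return self.replacement
        return element.tagName

    def start_element(self, element):
        attrs = "".join(
            " %s=%s" % (name, quoteattr(value))
            for name, value in element.attributes.items()
        )
        return "<%s%s>" % (self._name(element), attrs)

    def end_element(self, element):
        return "</%s>" % self._name(element)

    def text(self, node):
        return escape(node.data)


def render(node, visitor):
    """Render a subtree as a string without modifying the tree."""
    if node.nodeType == Node.ELEMENT_NODE:
        inner = "".join(render(child, visitor) for child in node.childNodes)
        return visitor.start_element(node) + inner + visitor.end_element(node)
    if node.nodeType == Node.TEXT_NODE:
        return visitor.text(node)
    return ""  # other node types are ignored in this sketch
```

Calling `render(doc.documentElement, RenamingVisitor("From", "SENT"))` on a parsed inbox document yields the renamed markup, and the parsed tree still contains the original `From` elements afterwards.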

In summary, both techniques described above are quite similar:
they both examine each node in the DOM tree, searching for Elements
whose text matches the target. When manipulating the tree directly, the test of
node type (that is, the test of whether the node is an Element) is
made explicitly by the programmer's code; when using XML4J's
Visitor pattern, the base class does the test for us, and simply
calls an appropriate method when an Element is encountered. Also, the
Visitor technique leaves the DOM tree intact, which permits us to
perform additional operations on the original form. After
considering both techniques, we chose to implement the Visitor
technique.

The XSL style sheets express through a pattern matching language
what transformations are to be performed. The style sheets are
applied by means of chaining the data source servlets with a servlet
developed by our team in conjunction with the WebSphere Application
Server (WAS) team. The XSL style sheets we developed are stored in
the WebSphere, and selected based on the output type requested by the
client.

Other formatters based on JSP technology are conceived but remain 
unimplemented . 

The choice of formatting technology will vary widely. We expect 
the choice to be primarily based on the knowledge and experience base of
the implementor rather than on the goodness of any particular 
technology. 

Where possible, our prototype leverages existing web 
infrastructure. As shown in Figure 2, each component listed above is 
implemented as a servlet written in Java. Parameters passed from 
clients to the infrastructure (e.g., device indicators) are embedded 
in HTTP requests. Our servlets are tied together via servlet chaining 
as implemented by IBM's WebSphere product. We use Apache as our HTTP 
server . 

The ReqMux must know about the available sources. In our first 
prototype, this information was configured statically. A systems 
administrator configured the ReqMux with a list of available sources 
and the commands available for each source . 

We later augmented our ReqMux and facades to allow dynamic 
registration. When a facade comes on-line (with its corresponding 
source) , it registers with the ReqMux via an HTTP flow. The facade 
passes its URL and the list of valid commands.

The ReqMux must then update its request-routing table to reflect
the new sources. Commands that do not overlap with any registered 
commands are simply added to the table. If the same command has 
previously been registered by another facade, then the conflict must 
be noted in the table. Invocations of such requests are resolved 
using state information, as described above. 

This system allows users to access multiple data sources 
seamlessly. However, access to multiple data sources requires that 
the user be authenticated to each source. One could require that the 
user enter the password for each source individually, but this is 
overly cumbersome. 

Commercial global sign on (GSO) products handle this problem by 
allowing the user to enter the passwords for each data source 
once. The user authenticates once with the infrastructure, which then
acts on his behalf when interacting with the sources. 

We created a modestly secure GSO system for our prototype. Our 
system is clearly not sufficiently secure for commercial deployment, 
but it served our purposes . 

The point of this system is to allow users to access sources 
multimodally . In the easiest case, they use a browser to generate 
the HTTP requests that drive our system. Typically, the process 
begins when a user enters the system login URL, and is challenged for 
a user ID and password. 

If the user successfully authenticates, the system returns its
main web page. This page includes the list of valid requests, and
the user selects by clicking on hyperlinks. In short, this is a
typical web experience.

Note that our HTML formatters are optimized for small screens. 
(We used an HP 660LX for our experiments.) Thus, the same output 
will suffice for both PCs and PDAs. However, since we optimize for a
PDA, we expect that an alternative formatter could produce a richer
experience on a full-size PC.

Our prototype speech client is a Java application, which means it 
is typically run from a command-line, rather than from within a
browser window. Once the application is started, it works much like 
the HTML client. 

For voice recognition, we use IBM's ViaVoice 98 Executive 
Edition (tm), supplemented by speech enablement for Java. That 
package maps the JSML API onto ViaVoice. 

First, the user must authenticate himself, which he does by speaking 
the command: login <username>. 

The user must then complete the authentication by validating that
he is the user he asserted he is. If the user has access to a 
keyboard, or even a numeric keypad, he can supply a password or PIN 
as he would in a text situation. Alternatively, smartcards or 
biometric identification equipment can be used, where available. 

However, in a speech-only environment, these solutions are not 
practical. Instead, we use a challenge/response mechanism. The user 
preconfigures a number of questions and answers. When he logs in,
the system selects one of these questions at random. His response is
compared to the answer previously provided. While not completely
satisfactory, it will suffice for the purposes of our prototype.

One additional complication is that the quality of free-form voice 
recognition is still quite poor. Thus, were we to rely on a 
free-form answer to the password challenge, we'd often get cases where
the user's correct response was recognized as an incorrect word. 

To improve recognition quality, we use a restricted grammar. As 
described in the Java Speech API homepage, the speech recognition 
engine can be supplied with a BNF-like grammar that defines which 
phrases are acceptable. Our infrastructure populates the grammar with
the correct answer and a number of incorrect choices. When a word (or
phrase) is spoken, the engine determines whether or not it matches a
token in the grammar. If so, we pass that token to the server for
comparison with the correct answer; if not, we report failure to the
server.

For example, if the user registered the question, "What is your
dog's name?" and the correct answer is "spot," the system would also
send down such incorrect responses as "tiger," "muffin," etc. A
correct answer would be returned only if the recognition engine heard
"spot"; failure would be returned if it heard "tiger" or
"muffin," or if it could not match the response to one of the valid
phrases.
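
The restricted-grammar challenge can be sketched in a few lines. This is
only an illustration of the flow described above, not the prototype's
code: the function names are hypothetical, and plain string matching
stands in for the speech recognition engine.

```python
import random

def build_grammar(correct, decoys, rng=random):
    """Phrases the recognizer is allowed to accept, in shuffled order."""
    phrases = [correct.lower()] + [d.lower() for d in decoys]
    rng.shuffle(phrases)
    return phrases

def match_token(heard, grammar):
    """Client side: return the grammar token that was heard, or None."""
    token = heard.strip().lower()
    return token if token in grammar else None

def verify(token, correct):
    """Server side: succeed only if a token matched and it is the answer."""
    return token is not None and token == correct.lower()

grammar = build_grammar("spot", ["tiger", "muffin"])
print(verify(match_token("Spot", grammar), "spot"))    # True
print(verify(match_token("tiger", grammar), "spot"))   # False
print(verify(match_token("rover", grammar), "spot"))   # False (no grammar match)
```

Shuffling the phrases is an assumption made here so that the position of
an entry in the grammar does not reveal which one is correct.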

In some cases, the user will not want to run the speech 
application on his local device. Yet a user with only a
screenless cell phone should still have access to his data sources. 
In such cases, we employ a dial-in proxy, as shown in Figure 3. 

The user calls the proxy, which answers the call and runs our Java 
application. The connection between the user and the proxy is 
standard voice over cellular; the connection between the proxy and 
our infrastructure uses HTTP over the internet. 

While we have not yet created a SpeechML client, it is worth 
discussing the differences between JSML and SpeechML. JSML is a 
class library that exports speech functions in a Java environment. 
Much like the structure of GUI programs, JSML programs typically have 
a main routine that waits for events from the (speech) UI.

In contrast, SpeechML programs are typically structured more like 
a series of web pages. Each "page" consists of three types of
components: spoken "prompts," a list of valid responses to each
prompt, and actions that are to be performed for each valid response.
Prompts often have a form such as:
Say 'one' to get your urgent messages
Say 'two' to get your non-urgent messages

Valid responses to these prompts are, of course, 'one' and 'two.'
Actions tell the SpeechML browsers how to react to each spoken
response. As in a standard web environment, actions are typically
hyperlinks. In our example above, the linked pages might contain the
urgent and non-urgent messages.
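
The page structure just described (prompts, valid responses, actions)
can be modeled as a small data structure. The sketch below is not
SpeechML syntax; the class, the page names, and the message text are
invented purely to illustrate how a spoken response selects a link.

```python
from dataclasses import dataclass, field

@dataclass
class Page:
    prompts: list
    # Maps each valid spoken response to the name of the linked page.
    actions: dict = field(default_factory=dict)

    def valid_responses(self):
        return list(self.actions)

pages = {
    "menu": Page(
        prompts=["Say 'one' to get your urgent messages",
                 "Say 'two' to get your non-urgent messages"],
        actions={"one": "urgent", "two": "non_urgent"},
    ),
    "urgent": Page(prompts=["You have no urgent messages"]),
    "non_urgent": Page(prompts=["You have no non-urgent messages"]),
}

def follow(current, spoken):
    """Return the next page name; stay on the page for an invalid response."""
    return pages[current].actions.get(spoken, current)

print(follow("menu", "one"))     # urgent
print(follow("menu", "three"))   # menu
```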

Since SpeechML is quite similar to HTML, SpeechML output from our 
formatters would be quite similar to HTML output. Of course, 
additional attention must be paid to ensure that the spoken prompts 
are easily distinguished auditorially, and that valid responses can 
be distinguished by the speech recognition system. 

We've discussed an infrastructure that allows multimodal access to 
a variety of data sources. Our standard architecture for adding data 
sources reduces the impact such additions have on the rest of the 
system. Similarly, adding new modalities has no impact on either 
existing sources or other existing clients.

Notes: (1) Recall that the Visitor pattern is useful when "the 
classes defining the object structure rarely change, but you often 
want to define new operations over the structure." This is exactly 
the situation with the DOM tree. The objects in the tree are 
standardized, and thus will change infrequently, but the operations 
to be performed change with each new desired transformation. See 
also Design Patterns, Gamma et al., p. 331.

TXT - IBM's recent announcement of an Internet-enabled car introduces
possibilities for new mechanisms for information delivery. Here we
describe a process for customized delivery of information to a
person's automobile. The goal of the process is to allow the user to
get relevant information delivered in a time-efficient manner. The
process, in short, is simple: have the user's home PC surf the web for
him, gathering material; translate the material into audio format;
send the audio to the car and store it; and have the car replay the
audio. In more detail, the process comprises the following steps:

1) information gathering and filtering 

2) audio production 

3) delivery of the (audio) information to the car 

4) storage of delivered material for later replay 

5) replay 

The information gathering uses standard web techniques. A user
specifies topics of interest (e.g., U.S. politics, soccer, Middle
East), and his computer stores these in a profile. Overnight, it uses
standard search engines to locate pages matching the search criteria.
It then downloads only pages from designated sites (e.g., CNN, NY Times,
ESPN) created since the last search.

Alternatively, the user can use existing customized news 
services, such as MyYahoo (http://my.yahoo.com) to gather the news. 

The web pages are then run through a speech synthesizer to
create an audio file. Several speech synthesizers are sold
commercially.

The information is then transmitted to a receiver in the 
designated vehicle. Transmission can use one of several techniques, 
including a cellular telephone call, or more economically, a 900MHz 
transmitter and receiver. (900MHz telephones, which contain a
transmitter in the handset and a receiver in the base, can be purchased
for under $70.) Other transmission mechanisms are possible.

The information is then stored by the car either in RAM or on
writeable media such as a writeable CD or a hard drive. IBM's Bamba
audio format
(http://www.alphaworks.ibm.com/examples/bambaforjava/example.html)
requires approximately 6 Kbits/sec to transmit audio, so (e.g.) 30
minutes of recording requires about 10 Mbit, or under 1.5 MB. Commodity
RAM is extremely inexpensive per megabyte, so the storage is
economically feasible.
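
The storage estimate can be checked with a short calculation. The sketch
assumes "6 Kbits/sec" means 6,000 bits per second and uses decimal
megabits and megabytes; both are assumptions made for the example.

```python
BITRATE_BPS = 6_000   # "approximately 6 Kbits/sec", taken as 6,000 bit/s
MINUTES = 30

total_bits = BITRATE_BPS * MINUTES * 60
total_megabits = total_bits / 1_000_000
total_megabytes = total_bits / 8 / 1_000_000

print(total_megabits)    # 10.8
print(total_megabytes)   # 1.35
```

The result, 10.8 Mbit or 1.35 MB, agrees with the rounded figures quoted
in the text.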

That information can be replayed by the user upon request. 

Note that if the car actually contains a processor, as in the
Internet car, the information delivery and audio generation steps can
be swapped. In fact, the need for the user's PC can be eliminated if
the user is willing to allow the car to perform every step. Since the
cost of connecting a mobile device to the Internet is still rather
high via cellular, and 900MHz is low bandwidth, such a tradeoff is not
currently economical in most cases.

Similarly, after the information gathering step, the
information can be transmitted to the car's computer as text, and the
audio can be synthesized by the car's computer itself.

Other variations on the process will be obvious to one skilled
in the art.
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and Telephones into a Low-Cost Home Network 
PUB - IBM Technical Disclosure Bulletin, December 1997, US 
VOL - 40 
NR - 12 
PG - 23 - 24 

TXT - This document contains drawings, formulas, and/or symbols that
will not appear on line. Request hardcopy from ITIRC for complete
article.

Disclosed is a low-cost method and apparatus to solve the
problem of inconvenient access to the Personal Computer (PC). A home
network (36) (shown in the Figure) consists of a PC (2), a phone-line
switch (4), a telephone (8), a wireless pointing device (26), a
Television (TV) (16), a pair of TV signal transmitter (10) and receiver
(12), a pair of audio signal transmitter (28) and receiver (30), a
speaker (34), control signal transceivers (18) and (20), and a home
appliance (22). The key features of this network are incoming message
alert, remote access to the PC (2) with the telephone (8), and web
surfing and Compact Disk-Read Only Memory (CD-ROM) game playing on a
TV (16).

There are two methods to implement the incoming message
alert. The first one uses the speaker (34); once the PC (2) receives a
voice message or an e-mail, it drives the audio signal transmitter (28)
to send a voice signal to the receiver (30) through wires or via radio
frequencies. The speaker (34) can then announce the type of message
and name of the person who should receive it. The second method uses
different ring patterns to identify different messages. For instance,
one ring represents an incoming e-mail for person A and two rings
indicate an incoming e-mail for person B. The different ring patterns
can be generated by activating the phone-line switch (4) to ring the
phone or a sound generator, such as a wireless door chime.

For remote access to the PC (2), a user issues commands and
listens to voice feedback from the PC (2) through the phone-line switch
(4) and telephone (8). By default, the telephone (8) is connected to
an external phone line until the off-hook signal and a specific keypad
stroke pattern (for example, ##1) are detected. After this event occurs,
the phone-line switch (4) connects the telephone (8) to the PC (2), which
executes the tasks for voice commands and text-to-speech functions. The
phone-line switch (4) enables the user to retrieve voice and electronic
mail, get stock quotes from the web, and control the home appliance (22)
from any place with a telephone (8).
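
The switching rule described above amounts to a small state machine. The
following sketch is illustrative only: the class and method names are
hypothetical, and returning to the external line on hang-up is an
assumption, since the disclosure does not say how the PC connection ends.

```python
class PhoneLineSwitch:
    ACCESS_PATTERN = "##1"   # example keypad pattern from the text

    def __init__(self):
        self.connected_to = "external line"   # default connection
        self.off_hook = False
        self.keys = ""

    def lift_handset(self):
        self.off_hook = True
        self.keys = ""

    def hang_up(self):
        # Assumption: hanging up restores the default connection.
        self.off_hook = False
        self.keys = ""
        self.connected_to = "external line"

    def press(self, key):
        """Track the most recent keys; switch to the PC on the access pattern."""
        if not self.off_hook or self.connected_to == "PC":
            return
        self.keys = (self.keys + key)[-len(self.ACCESS_PATTERN):]
        if self.keys == self.ACCESS_PATTERN:
            self.connected_to = "PC"

switch = PhoneLineSwitch()
switch.lift_handset()
for key in "##1":
    switch.press(key)
print(switch.connected_to)   # PC
```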

For web surfing and CD-ROM game playing, visual feedback is
essential. A pair of low-cost TV transmitter (10) and receiver (12)
sends the image generated by the PC (2) to the television (16); the
transmission media can be radio frequencies or residential wires. With
a wireless pointing device (26) and the telephone, the user can
comfortably surf the net or play interactive CD-ROM games in his
living room while the PC (2) is somewhere else, for example, the
study room.

TI - Notification of Availability of Office Equipment through Telephone Call
PUB - IBM Technical Disclosure Bulletin, February 1995, US
TXT - This document contains drawings, formulas, and/or symbols that
will not appear on line. Request hardcopy from ITIRC for complete
article.

This article describes a mechanism that notifies a user of
office equipment, by telephone, when the equipment becomes
available.

Office equipment, such as copiers and OHP foil makers, is
usually shared among persons in groups, and located in places which
may be far from the desks of some users. This sharing normally increases
cost-effectiveness by increasing utilization of the equipment.
However, in terms of the utilization of human resources, this sharing
usually increases the time wasted on unproductive activities. For
instance, if a person brings sheets of paper to a copier and finds it
busy (used by another person), he has to wait for the copier to
become available, or return to his desk and come back again after
some time. Or, he may have to ask the current user to let him know
when the copier becomes available. (Here, a copier is used just as
an example of office equipment; the mechanism is applicable to other
equipment as well.)

With the mechanism in this article, a user need not wait at the 
copier until it becomes available, nor ask the current user to let 
him know when it becomes available. Instead, he enters his telephone 
number so that the copier informs him of its availability later by a 
phone call. 

Fig. 1 shows an overall configuration in which telephones and
copiers are connected through a PBX (private branch exchange) network.
Although telephones and copiers are shown making phone calls to each
other in Fig. 1, only copiers make a phone call to a telephone in
this article. A copier notifies a person of availability with
synthesized or recorded voice messages.

Fig. 2 shows internal components (devices) of typical office
equipment with this mechanism, where Functional Component and User
Interface are common to conventional office equipment, and the rest
of the components are added for this mechanism: (1) a device that places
a telephone call, (2) a device that "speaks" messages, which are
synthesized or recorded in advance, (3) a device that answers a
telephone call, (4) a device that recognizes DTMF tones as requests
or commands, (5) a device that processes commands issued as DTMF tones
or from User Interface, (6) a device that monitors the status of the
equipment, makes a phone call to the current user, and drives the
Speak device to notify the status change with voice messages.
Components (3) and (4) are shown for completeness, and are not used
in this mechanism.
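
The notification flow can be sketched as follows. The class is
hypothetical, the `place_call` callback stands in for the call-placing
and speaking devices, and serving waiting users first come, first served
is an assumption that the article does not state.

```python
from collections import deque

class Copier:
    def __init__(self, place_call):
        self.place_call = place_call   # callable(number, message)
        self.busy = False
        self.waiting = deque()         # phone numbers entered by waiting users

    def start_job(self):
        self.busy = True

    def register(self, phone_number):
        """A user who finds the copier busy enters his telephone number."""
        self.waiting.append(phone_number)

    def finish_job(self):
        """Status monitor: when the copier becomes free, call the next user."""
        self.busy = False
        if self.waiting:
            self.place_call(self.waiting.popleft(),
                            "The copier is now available.")

calls = []
copier = Copier(lambda number, message: calls.append((number, message)))
copier.start_job()
copier.register("2345")
copier.finish_job()
print(calls)   # [('2345', 'The copier is now available.')]
```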

TXT - - This article describes a technique by which a personal
computer (PC)-based automation controller can communicate in a
flexible manner with remote personnel or machines.

Automation equipment is typically controlled by industrial 
computers. These PC-based systems provide an easy, flexible, 
strategic, as well as common, programming and hardware architectural 
environment. The voice communications adapter is a card for the PC 
that allows the PC to recognize or synthesize the human voice, as well
as to control a telephone line and act as a modem.

Before a tool is shipped from a plant, it is usually operated 
for a period of time to produce and eliminate any early life 
failures. In order to meet expected production schedules it is 
necessary to consider running the tools unattended. The problem, 
however, is the possibility of soft failures that could disrupt the 
process early. In response to this particular problem, the 
controller is provided with subroutines that enable the program to 
select a person's name and number from a file, and dial that number
in an effort to locate that person and convey the current error
status of the machine.

The voice communications adapter disclosed herein provides for
the control and monitoring of a telephone extension. Among the
functions that are provided is the ability to take the handset off
hook, identify a dial tone, dial a number, check for a carrier from a
modem, check for a possible response from a human voice, generate
speech over the phone from text strings, and identify digit tones.
By using these functions, a means is provided to the controller of a
given tool to access a file containing necessary data to locate and
identify a remote person or machine by telephone. Exception handling
is provided by virtue of the ability of the software to obtain the
status of the line and to monitor for given conditions. The
application could be programmed to transmit ASCII data or speech in a
flexible manner based on program logic and with feedback and/or
verification from a party using the tones generated by a TOUCH-TONE*
phone. The intended result of the application described in this
disclosure is to provide a set of subroutines that are enabled by the
PC operating system to use the features of the voice communications
adapter to communicate in a flexible manner with remote locations
using an existing network (the telephone). Initial code was written
in C to support the above functions.
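
The call-out logic can be illustrated with a short sketch. It is written
in Python rather than the C mentioned above, and everything in it is
hypothetical: `dial` stands in for the adapter functions and is assumed
to report "voice", "carrier", or "no answer" for a given number.

```python
def notify(contacts, dial, status):
    """Call each (name, number) pair in turn until a person or modem answers.

    A human answer gets the status as speech; a modem carrier gets it as
    ASCII data. Returns None if nobody could be reached.
    """
    for name, number in contacts:
        answer = dial(number)
        if answer == "voice":
            return (name, "speech", "Tool status: " + status)
        if answer == "carrier":
            return (name, "ascii", status)
    return None

# Simulated telephone line for the example; the numbers are made up.
contacts = [("operator", "5550101"), ("host computer", "5550102")]
fake_line = {"5550101": "no answer", "5550102": "carrier"}
print(notify(contacts, fake_line.get, "soft failure, station 3"))
# ('host computer', 'ascii', 'soft failure, station 3')
```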

The drawing shows in block diagram a flexible voice 
communication system for a manufacturing environment. 
* Trademark of AT&T Co. 

TXT - - Disclosed is a telephone system, based on known technologies,
which automatically dials an intended individual's number in response
to a user's spoken request for that person by nickname. Referring to
the block diagram in the Figure, the system operates as follows: A
user wishing to call a person named "Joe Guy" employs handset 1 to
verbally make the request "call Joe". This vocal message is received
and digitized by Voice Input Processing module 2. Controller 3 then
passes the digitized request to Pattern Matching Processing module 4.
This module searches the pre-stored database in Voice Match Data and
System Messages module 5 for a nickname matching the current request
for "Joe". If a match is found, the "Joe" data record is copied to
the Controller. The Controller, using this data and Voice Synthesis
Processing module 6, generates the spoken message to the user
"calling Joe Guy". No verbal response from the user within some
predetermined interval indicates confirmation. The Controller then
transfers the numeric telephone number data from the "Joe" data
record to Network Access Processing module 7, which proceeds to dial
the number and connects the called line to handset 1. The telephone
number may be for a local PBX extension, tieline, WATS line, outside
local or long distance exchange line, or any other line that the user
could dial himself. If the confirmation statement generated in the
above case is "calling John Jones", a misunderstanding has occurred.
The user repeats his request "call Joe", and the system tries again.
If Pattern Matching Processing module 4 cannot find a match to the
requested nickname, it may be an indication that the request has been
misunderstood. The Controller then generates the spoken message to
the user "repeat request". A second successive failure may indicate
that "Joe" is not on record. Given the above functional capability,
it is apparent that the system is not limited to the straightforward
usage described above. The data record for "Joe" can include multiple
telephone numbers to be selected from depending on the date or time
of day (office, factory, home, other known locations) and on network
conditions (network cost differences vs. time, network availability,
etc.). The system can be used as a directory, permitting a user to
call from a remote location, verbally request "number for Joe" and
receive a system-generated voice response, "number for Joe Guy is
five five five seven three one two". Considering the possibility of
distinguishing different voices, "Joe" may be different people to
different users.
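
The lookup and time-of-day selection can be sketched as a table-driven
routine. This is a hedged illustration: the record layout, the hour
ranges, and the second telephone number are invented, and an exact
string match stands in for the voice pattern matching of module 4.

```python
# nickname -> record; each number carries an (inclusive, exclusive) hour range
DIRECTORY = {
    "joe": {
        "name": "Joe Guy",
        "numbers": [(9, 17, "5557312"),    # office hours
                    (0, 24, "5550100")],   # any other time (hypothetical)
    },
}

def number_for(record, hour):
    """Pick the first number whose hour range covers the current hour."""
    for start, end, number in record["numbers"]:
        if start <= hour < end:
            return number
    return None

def handle_request(nickname, hour):
    record = DIRECTORY.get(nickname.lower())
    if record is None:
        return ("say", "repeat request")
    return ("confirm", "calling " + record["name"], number_for(record, hour))

print(handle_request("Joe", 10))   # ('confirm', 'calling Joe Guy', '5557312')
print(handle_request("Bob", 10))   # ('say', 'repeat request')
```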

TXT - - This system provides an effective mechanism for generating and
delivering a computer-controlled personal reminder regardless of what
application program is executing. The system is based on an 
independent time-keeping process. On the IBM PC, under PC-DOS, this 
process can be a device driver as described below. Under other 
operating systems on the IBM System/370, the process can be performed 
by a separate virtual machine or a task running without an attached 
terminal. The independent time-keeping process (REMINDER DEVICE 
DRIVER) is used to control the storing and delivery of reminders. 
The device driver is a piece of virtual hardware that acts like an 
input/output device. An application program, such as a calendar, can
write reminders to the REMINDER DEVICE DRIVER which then acts as a 
storage device and clock watcher. At the appointed time, the 
REMINDER DEVICE DRIVER determines the best means to deliver the
reminder, e.g., signal to the calendar program, audible alarm,
computer synthesized voice, telephone call, or other appropriate
means. Each time a new reminder mechanism is added to the system, a
message must be written to the REMINDER DEVICE DRIVER so that this 
means can supersede the current means as the best available. In 
addition, each time a mechanism is removed from the system, a 
corresponding message restores the prior best means for delivery. The 
novelty here is in connecting the device driver to an application 
that maintains its primary data at another network node or a host 
computer. In addition, the connection of the alarm capability to a 
more natural user interface (such as a telephone or computer 
synthesized voice) enhances the REMINDER DEVICE DRIVER to make it
more usable and understandable.
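
The "best available means" bookkeeping behaves like a stack: a newly
added mechanism supersedes the current one, and removing it restores the
prior one. The sketch below models only that behavior; the class, its
methods, and the baseline "audible alarm" entry are assumptions, not the
device driver interface.

```python
class ReminderDriver:
    def __init__(self):
        self.means = ["audible alarm"]   # assumed baseline mechanism
        self.reminders = []              # (time, text) pairs written by applications

    def add_means(self, name):
        self.means.append(name)          # newest mechanism becomes the best

    def remove_means(self, name):
        self.means.remove(name)          # prior best is current again

    def best_means(self):
        return self.means[-1]

    def write(self, when, text):
        self.reminders.append((when, text))

    def tick(self, now):
        """Deliver, and then drop, every reminder whose time has arrived."""
        due = [r for r in self.reminders if r[0] <= now]
        self.reminders = [r for r in self.reminders if r[0] > now]
        return [(self.best_means(), text) for _, text in due]

driver = ReminderDriver()
driver.add_means("telephone call")
driver.write(10, "staff meeting")
print(driver.tick(10))   # [('telephone call', 'staff meeting')]
```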

TXT - - This article describes a universal tone receiver/transmitter
module which provides a telephone switching system with one module
for all tone signaling. The application of all digital technology 
offers many advantages over present methods of telephony-oriented 
tone generation and detection. Adaptation to a telephone company 
(telco) tone plan anywhere in the world requires no physical or 
electrical adjustments, as the CPU can be programmed to update the 
tone plan parameters. Each tone plan may also have many subsets of 
tones. For example, in the USA, tone subsets include the dual-tone
multi-frequency (DTMF) plan, the multi-frequency (MF) plan, and the
progress tone plan. Other countries have their own unique tone plans. 

The tone module of this disclosure easily adapts to these varied 
tone plans with only software changes. Other private automatic 
branch exchanges (PABXs) use multiple module types to perform these 
tone signalling functions. This older technique of multiple modules 
requires many different types of spare modules to maintain a PABX 
installation. It also requires that a different set of tone modules 
be used whenever the installation site uses a non-standard tone plan.
The present universal tone transmitter/receiver module reduces these
varied tone module requirements to a single module type. Another 
advantage is in the predominant use of digital technology which 
facilitates automated manufacturing and testing. The universal tone 
transmitter/receiver module disclosed herein is designed to be a 
function module in a larger telecommunication system such as a PABX 
as shown in Fig. 1. This universal tone transmitter/receiver module
can be used in any application where telephony signalling information 
is processed in the frequency range of 200 to 4000 Hertz. The 
function of this module is to synthesize and analyze audio waveforms.
All data for the generation of audio signals is downloaded from the
central processing unit (CPU) at system initialization and can be
updated as the needs of the system change. Types of data involved in 
this audio generation include: frequency or frequencies, amplitudes,
phase angles and duty cycles. Detection of audio energy involves
determining the spectral components present, their related amplitudes
and phases. The types of functions this module can perform include:
DTMF signalling, MF signalling, progress tone signalling, modem
operations, and voice recognition/synthesis. The present tone module
has eight input ports. These input ports are connected to a
switching module which connects them to various telecommunication 
ports as the system requirements dictate (Fig. 1). The incoming
audio waveforms are multiplexed and then converted to digital sample
sets using an analog-to-digital converter. The digital sample sets
are then stored in the memory section of the CPU. The signal 
processor then performs the required operations as defined by the CPU 
on these sample sets in order to analyze the characteristics of the 
audio waveforms. The results of the computations are then stored in 
the memory section awaiting transfer to the CPU. The universal tone 
transmitter/receiver module subsystem is shown in block diagram in 
Fig. 2 and includes a microprocessor-controlled bus interface 1 to 
the next level processor, a digital signal processor 2 to analyze and 
synthesize audio waveforms, an analog/digital converter 3 to 
translate between the digital and analog domains, a multiplexer 4 to 
time division multiplex the signal processor among multiple ports, a 
programmable hardware section 5 to generate commonly used tones under 
the control of the microprocessor, and a memory section 6. The tone 
module has six output ports that are under direct control of the bus 
interface microprocessor 1. The CPU downloads digital 
representations of the audio waveforms to the tone module. The 
digital instructions are decoded, and the one waveform cycle is 
repeatedly sent to the digital/analog converter 3. Optional
parameters for the transmission of the waveform include time 
durations, on/off cycle times and alternating between two different 
waveforms. The signal processor 2 has direct control of two output 
ports. These ports are for infrequently used messages and speech
synthesis. To output on these ports, the CPU transmits to the tone
module a digital data string containing the waveform information.
The signal processor composes the desired output string in a digital 
format and sends it to the digital/analog converter. The analog 
waveform is demultiplexed to one of the two output ports. The 
synthesized audio waveforms at the output ports of this tone module
are resources for the PABX system to utilize as the network
requires.
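
The idea of a software-loadable tone plan can be illustrated with a
table-driven synthesizer. The sketch is not the module's firmware: the
8000 Hz sample rate, the function name, and the table layout are
assumptions. The DTMF frequency pairs and the 350 + 440 Hz North American
dial tone are the standard values.

```python
import math

SAMPLE_RATE = 8000   # Hz; assumed, typical for telephony

# Tone plan table that a CPU could download and later replace.
TONE_PLAN = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477),
    "*": (941, 1209), "0": (941, 1336), "#": (941, 1477),
    "dial": (350, 440),
}

def synthesize(symbol, duration_ms, amplitude=0.5, plan=TONE_PLAN):
    """Return samples for the sum of the sinusoids listed for `symbol`."""
    freqs = plan[symbol]
    n = SAMPLE_RATE * duration_ms // 1000
    scale = amplitude / len(freqs)   # keeps the peak within `amplitude`
    return [scale * sum(math.sin(2 * math.pi * f * i / SAMPLE_RATE)
                        for f in freqs)
            for i in range(n)]

samples = synthesize("5", 50)
print(len(samples))   # 400
```

Adapting to another tone plan then means passing a different table as
`plan`, with no change to the synthesis code.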

9/13 - (C) IBM CORP 1993 
AN - NN85014989 

TI - Optimal Retention Delay in the Receiver of a Digital Voice Network 
PUB - IBM Technical Disclosure Bulletin, January 1985, US 
VOL - 27 
NR - 8 

PG - 4989 - 4990 

TXT - - At the receiving end of a digital voice packet transmission
network, voice synthesized signals have to be re-synchronized due to
packet transmission delay jitter. To avoid these distortions, instead
of being fed into the synthesizer as they arrive, the packets are fed
into a buffer register. They are then fetched out of the buffer
sequentially after a fixed length retention delay greater than the
maximum transmission jitter expected. This retention delay is reset 
after each long speech pause (e.g., greater than a predetermined 
length value). The transmission delay jitter is thus absorbed by the
long speech pauses, while effective speech signal synthesis is not
affected. This method may however penalize the system by unduly
affecting slow packets. A solution is proposed here to avoid these 
drawbacks. After a long pause, the first incoming packet is 
systematically fed into a queue buffer at a position from which it 
would be extracted for use in voice synthesis after
an initial retention delay greater than the maximum expected
network delay for the packet channel used. The packet is
then systematically shifted toward the buffer output at a rate equal
to the packet transmission rate. The subsequent packets are then 
normally fed into the buffer register. The queue buffer position 
corresponding to the initial retention delay is made to represent a 
buffer threshold. In operation, the threshold position is permanently 
monitored. Assuming the incoming packets are all introduced above the 
threshold, the threshold is made to shift one position lower at the 
first incoming packet following the next long pause. This operation 
may be repeated until a first packet is made to occupy a queue 
position below the threshold, or vice-versa. The above method could 
be implemented using different means. For instance, a counter CPTR 
could be incremented upon each packet being received. Then a device 
could be used for computing an increment parameter (DELTA) for each 
incoming packet, according to:

   DELTA(n) = [CPTE(n) - CPTR(n)] mod 8

wherein n represents the nth received packet, and CPTE(n) is a
3-bit number assigned sequentially to each transmitted packet and
coded within the packet frame. Then the queue packet position (PP)
where the incoming nth packet should be fed into would be derived
from the previous packet position through:

   PP(n) = [PP(n-1) - DELTA(n-1) + DELTA(n)] mod 8

The first packet position would be PP1 = 3.
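
The two recurrences can be exercised with a few lines of code. This
sketch reads "/mod 8" as a modulo-8 reduction of the whole expression
and assumes the receive counter CPTR starts at zero; both are
interpretations, and the function names are hypothetical.

```python
MODULUS = 8          # CPTE is a 3-bit sequence number
FIRST_POSITION = 3   # PP1 in the text

def delta(cpte, cptr):
    """DELTA(n): sent sequence number minus receive counter, modulo 8."""
    return (cpte - cptr) % MODULUS

def positions(cpte_sequence):
    """Queue position of each packet, given CPTE values in arrival order."""
    result = []
    prev_pos = prev_delta = None
    for cptr, cpte in enumerate(cpte_sequence):   # CPTR counts received packets
        d = delta(cpte, cptr)
        if prev_pos is None:
            pos = FIRST_POSITION
        else:
            pos = (prev_pos - prev_delta + d) % MODULUS
        result.append(pos)
        prev_pos, prev_delta = pos, d
    return result

print(positions([0, 1, 2, 3]))   # [3, 3, 3, 3]
print(positions([0, 1, 3, 4]))   # [3, 3, 4, 4]
```

In the second call the packet numbered 2 never arrives, so the packets
that follow it are placed one position further along the queue.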

10/13 - (C) IBM CORP 1993 
AN - NB84124356 

TI - Carbon Microphone Inverse Filtering 

PUB - IBM Technical Disclosure Bulletin, December 1984, US 
VOL - 27 
NR - 7B 

PG - 4356 - 4357 

TXT - - In a digital voice network using low bit rate voice compression
techniques, a critical problem may appear due to the distortions
introduced by the combination of carbon microphones and telephone 
lines. More specifically, when the speech coder is located close to 
the speaker, one can use a dynamic microphone and a telephone line 
with quasi-flat frequency response, thus ensuring that a clean
digital speech signal is available at the input of the speech coder, 
which results in good quality synthesized speech. This solution is
currently applied in small configurations. However, the price to pay
in case of a large digital voice network would be too high since in
this case each customer would have its own voice digitizer coupled
with a special telephone set. For large networks, a solution consists 
in sharing a voice coder among several users who can get connected
through the public switched telephone network, using already installed
carbon microphones. In this case, the speech coder is located close
to a PBX. While presenting the best trade-off in terms of
implementation cost, this solution generally results in a poorer
speech quality at the output of the synthesizer. This is due to the
distortions added to the speech signal by the telephone line and by
the carbon microphone. Low bit rate speech coders are generally very 
sensitive to these distortions. One can however reduce these 
distortions by preprocessing the input speech signal before bit rate 
reduction. In this preprocessing, the line distortions are assumed to 
be linear, that is, the line is supposed to mainly introduce a 
frequency attenuation on the signal (band-pass filter). The carbon
microphone is modeled as a combination of linear distortion and
non-linear distortion. It is assumed that the non-linear distortion
consists in the corruption of the speech signal with a noise 
proportional to the speech envelope. Thus, it is possible to proceed 
to an adaptive filtering of the input speech signal. This filtering 
makes use of a comb filter adapted to the pitch period of the input 
speech, where the coefficients are adapted from the energy of the 
input speech signal. The global linear distortion due to the 
telephone line and to the carbon microphone is compensated by 
prefiltering the input speech with an adaptive inverse filter 
(basically a band-stop filter), the coefficients of which are adapted
from the frequency analysis of the input speech signal. 

11/13 - (C) IBM CORP 1993 

AN - NA84034947 

TI - Pseudo Hangover Synthesis 

PUB - IBM Technical Disclosure Bulletin, March 1984, US 
VOL - 26 
NR - 10A 
PG - 4947 

TXT - - In a digital voice network wherein N conversations are to be
transmitted over C equivalent channels (N>C), the channel assignments
are based on voice activity detection. The channels are assigned to 
active sources only. In other words, silences are not transmitted. 
However, it is difficult to detect the end of a talk-spurt, because 
the voice activity does not stop instantaneously. To provide smooth 
restitution, the voice sources are still considered active after each 
talk-spurt during a "hangover" time. Such hangovers do, however, 
increase the load of the network. A solution is proposed here to 
minimize the hangover load by not transmitting any voice signal 
during the hangovers and substituting for it a pseudo hangover voice
signal generated at the receiving end of the network. The proposed
solution consists of stopping voice transmission as soon as the voice
source energy drops under the voice activity detection (VAD) level,
and synthesizing the pseudo hangover. The synthesized voice is
provided by an attenuated reconstruction of the received voice signal
prior to VAD indicating the occurrence of a silence.
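
One simple way to picture the receiver-side synthesis is to replay the
last received frame with a decaying gain. The disclosure does not specify
the attenuation law or the frame handling, so the geometric decay, the
function name, and the parameters below are all assumptions.

```python
def pseudo_hangover(last_frame, n_frames, decay=0.5):
    """Repeat the last received frame n_frames times, attenuating each copy."""
    frames = []
    gain = 1.0
    for _ in range(n_frames):
        gain *= decay
        frames.append([gain * sample for sample in last_frame])
    return frames

print(pseudo_hangover([4.0, -2.0], 2))   # [[2.0, -1.0], [1.0, -0.5]]
```

Because the hangover is generated locally, no channel capacity is used
once the source energy falls below the VAD level.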

12/13 - (C) IBM CORP 1993 

AN - NN83014474 

TI - Voice and Data Transmission. January 1983. 

PUB - IBM Technical Disclosure Bulletin, January 1983, US 

VOL - 25 

NR - 8 

PG - 4474 - 4475 

TXT - 2p. This is a voice and data transmission system. The
proposed architecture is based on the fact that two main functions
have to be performed, i.e., voice compression, and voice and data
multiplexing.

Voice compression consists in reducing the voice PCM data rate
from 64 Kbps to 7,200 bps using split band and dynamic allocation of
quantizing bit techniques. More particularly, the 64 Kbps is fed
into a Voice Excited Predictive Coder (VEPC) which derives PARCOR
parameters (K) therefrom; these parameters are used to derive a
redundancy-free residual signal from the original vocal signal. The
residual signal is fed into a low-pass filter which derives therefrom
a band-limited or residual baseband signal together with information
relating to the energy (E) contained within the removed high
frequency bandwidth. The residual baseband is in turn split into p
sub-bands, the contents of which are requantized using dynamic bit
rate allocation techniques. It should be noted that the above speech
analysis operations are performed over blocks of samples 20 ms long.
The residual baseband is thus processed in block companded PCM
(BCPCM). Each block of samples provides one or two E, a block of eight
PARCOR coefficients, and requantized samples such that the overall
bit rate is limited to 7,200 bps.
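
The rates quoted above imply a fixed bit budget per 20 ms block, which can be checked with a few lines of arithmetic. The sketch below uses only figures stated in this disclosure (64 Kbps PCM, 7,200 bps compressed, 20 ms blocks, and the 125-microsecond sample period mentioned further on); the variable names are illustrative.

```python
PCM_RATE_BPS = 64_000          # input PCM rate
COMPRESSED_RATE_BPS = 7_200    # output rate of the coder
BLOCK_MS = 20                  # analysis block length
SAMPLE_PERIOD_US = 125         # one PCM sample every 125 microseconds

samples_per_block = BLOCK_MS * 1000 // SAMPLE_PERIOD_US             # samples in 20 ms
pcm_bits_per_block = PCM_RATE_BPS * BLOCK_MS // 1000                # bits entering the coder
compressed_bits_per_block = COMPRESSED_RATE_BPS * BLOCK_MS // 1000  # bits leaving the coder
bits_per_pcm_sample = pcm_bits_per_block // samples_per_block
compression_ratio = PCM_RATE_BPS / COMPRESSED_RATE_BPS

print(samples_per_block, pcm_bits_per_block, compressed_bits_per_block)
```

Each block of 160 eight-bit samples (1,280 bits) must therefore be represented by 144 bits, shared among the energy values, the eight PARCOR coefficients, and the requantized sub-band samples.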

Conversely, on the receiving side of the transmission network,
energies, PARCORS and samples will have to be recombined for
synthesizing the original speech signal.

Both analysis and synthesis of the speech signal are performed
using a single tributary microprocessor MP1. Every 125 microseconds,
the PCM coded data is serially loaded into a shift register SR1, and
then transferred into an input voice buffer VB1. The status of VB1
contents is indicated to MP1 by setting one bit in the status
buffer STAT. Interrupt must be performed within 125 microseconds
following the status change.

Incoming voice data information is buffered in MP1 for 20 ms, 
and then compressed to 7,200 bps using the above-mentioned algorithm. 

Also, every 125 microseconds, output buffer VB2 is loaded by
MP1 and then transferred into output shift register SR2.

The second main function of the network, i.e., voice and data
multiplexing, is then performed. The voice data transmission network
is mastered by a microprocessor MPo. This microprocessor controls
the multiplexing of the 7,200 bps speech-originating data with a
2,400 bps data channel, over a 9,600 bps data link using a 9,600 bps
modem.
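
The multiplexing budget can be illustrated per 20 ms interval: 7,200 bps of voice plus 2,400 bps of data exactly fills the 9,600 bps link. The sketch below is a minimal illustration; the frame layout (voice bytes first, then data bytes) and the function names are assumptions, since the disclosure does not specify the frame format.

```python
INTERVAL_MS = 20
VOICE_BYTES = 7_200 * INTERVAL_MS // 1000 // 8   # compressed voice bytes per interval
DATA_BYTES = 2_400 * INTERVAL_MS // 1000 // 8    # data-channel bytes per interval
LINK_BYTES = 9_600 * INTERVAL_MS // 1000 // 8    # link capacity per interval


def multiplex(voice: bytes, data: bytes) -> bytes:
    """Build one link frame (assumed layout: voice bytes followed by data bytes)."""
    if len(voice) != VOICE_BYTES or len(data) != DATA_BYTES:
        raise ValueError("unexpected block size")
    return voice + data


def demultiplex(frame: bytes) -> tuple[bytes, bytes]:
    """Split a link frame back into its voice and data parts."""
    if len(frame) != LINK_BYTES:
        raise ValueError("unexpected frame size")
    return frame[:VOICE_BYTES], frame[VOICE_BYTES:]
```

With these figures a frame carries 18 voice bytes and 6 data bytes, 24 bytes in total, which matches the byte-by-byte transfer through the communication adapters.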

More particularly, at MPo request (one bit set in the status
buffer STAT and its associated interrupt), the compressed data is
transferred into a MP1 output buffer in burst form on a 16-bit word
basis. Similarly at MPo request, a MP1 input buffer, when full, must
be read by MP1 for decompression, and 64 Kbps PCM sample restitution.
This operation occurs on a 20 ms basis.

The compressed voice data transfer from the MP1 output buffer
to the 9,600 bps communication link and from this link to MP1 input
buffer is performed on a byte by byte basis through a full duplex
communication adapter CCA1.

A similar communication adapter (CCA2) is used to interface the 
2,400 bps data channel. 

13/13 - (C) IBM CORP 1993 
AN - NN71043356 

TI - Spectrum Flattening in Vocoders. April 1971. 
PUB - IBM Technical Disclosure Bulletin, April 1971, US 
VOL - 13 
NR - 11 



PG - 3356 

TXT - 1p. In a conventional base-band vocoder synthesizer, the
speech quality is improved by the so-called "spectrum flattening"
feature. This is performed by using on each of the excitation
channels a premodulation network made of a band-pass filter BPFi
followed by a clipper CLi, i=1, 2, ..., n with generally n approx.= 15.

The clipping operation generates odd harmonics which could fall
within another channel bandwidth, and, therefore, could distort the
output signal provided by the summing in stage Sigma. These
harmonics must be removed. In conventional synthesizers the removal
is obtained through use of a second band-pass filter on each channel
(post-modulation network).

In order to facilitate a digital implementation of the
synthesizer, it is of high interest to remove the post-modulation
filters. A solution is provided through use of a frequency shifting
operation. A carrier frequency Fo is used to modulate the excitation
signal in stage M1 before driving the premodulation filters. The
frequency bandwidth of each BPFi is chosen such that the lower
sideband generated by modulator M1 is removed. In other words the
excitation frequency spectrum is shifted towards the upper frequency.

Therefore, the odd harmonics generated by the clippers are also 
shifted the same way. The frequency Fo is such that these shifted 
harmonics fall outside the bandwidth of the vocoder channels and, 
therefore, cannot interfere with a useful signal provided by any 
other channel. Thus, the post-modulation filters, not shown, are
needless and can be replaced by a single low-pass filter LFP1, which is
easy to implement and removes the odd harmonics. The synthesized
signal is demodulated by M2 and low-pass filtered to shift the speech 
signal back to the audio band. 
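
Two points in the description above lend themselves to a quick numerical check: symmetric clipping of a sinusoid produces only odd harmonics, and shifting the excitation by Fo pushes the third harmonic above the shifted band. The sketch below is illustrative only; the clipping level, the overdrive factor, the example frequencies, and the helper names are assumptions, not values from the disclosure.

```python
import math


def harmonic_amplitude(signal, k):
    """Amplitude of the k-th harmonic of one period of `signal` (direct DFT)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n


def hard_clip(x, limit):
    """Symmetric clipper: limit x to the range [-limit, +limit]."""
    return max(-limit, min(limit, x))


def shifted_third_harmonic_clears(fo, f_lo, f_hi):
    """True if the third harmonic of the lowest shifted frequency, 3*(fo + f_lo),
    lies above the top of the shifted band, fo + f_hi."""
    return 3 * (fo + f_lo) > fo + f_hi


N = 1024
sine = [math.sin(2 * math.pi * (i + 0.5) / N) for i in range(N)]
clipped = [hard_clip(4.0 * s, 1.0) for s in sine]   # overdriven, then clipped
amps = {k: harmonic_amplitude(clipped, k) for k in range(1, 8)}
```

In amps the even-numbered entries are numerically zero while the odd ones are not. With hypothetical figures, a 300 to 3,400 Hz band has its third harmonics starting at 900 Hz, inside the band, whereas after a 2,000 Hz shift they start at 6,900 Hz, above the shifted band top of 5,400 Hz.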
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ABSTRACT EP 598598 Al 

A text parser (34) for a text-to-speech processor accepts a text stream
and parses the text stream to detect non-spoken characters. A text
generator (36) generates pre-designated text sequences in response to
non-spoken characters, such as special character sequences or character
sequences which match format templates. A speech command generator (37)
generates speech commands in response to detecting of non-spoken
characters such as non-spoken characters which affect text style, font,
underlining, etc. A text-to-speech converter (26) converts spoken text
parsed by the parser and text generated by the text generator into
speech, the text-to-speech converter being operable in response to speech
commands generated by the speech command generator. According to the
invention, it is not necessary to pre-process text files in preparation
for text-to-speech conversion, and arbitrary files which contain both
spoken and non-spoken characters may be converted easily. (See image in
original document.)
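
The division of labour described in this abstract can be sketched as a small tokenizer. Everything specific here is an assumption made for illustration: the bold tags standing in for style-affecting non-spoken characters, the "&" and "%" substitutions standing in for pre-designated text sequences, and the command names.

```python
import re

# Hypothetical special characters replaced by pre-designated text.
SPECIAL_TEXT = {"&": "and", "%": "percent"}
# Hypothetical style markers turned into speech commands.
STYLE_COMMANDS = {"<b>": "EMPHASIS_ON", "</b>": "EMPHASIS_OFF"}
TOKEN = re.compile(r"</?b>|[&%]|[^<&%]+")


def parse(stream):
    """Split a text stream into ("text", ...) and ("command", ...) items."""
    out = []
    for tok in TOKEN.findall(stream):
        if tok in STYLE_COMMANDS:
            out.append(("command", STYLE_COMMANDS[tok]))   # speech command generator
        elif tok in SPECIAL_TEXT:
            out.append(("text", SPECIAL_TEXT[tok]))        # text generator
        elif tok.strip():
            out.append(("text", tok.strip()))              # ordinary spoken text
    return out


items = parse("Sales rose 5% <b>today</b> & yesterday")
```

A downstream converter would speak the "text" items in order and apply each "command" item to its voice settings as it is encountered.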
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Text-to-speech system having a lexicon residing on the host processor. 
Text-zu-Sprache Ubersetzungssystem mit einem im Hostprozessor vorhandenen
Lexikon.

Systeme de conversion texte/parole comportant un lexique resident dans le 
processeur principal. 
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A text-to-speech system is provided having a host system operable to 
perform a text-to-speech application program. The host system includes a 
memory storing the lexicon for a separate text-to-speech device. By using 
the host system to contain the lexicon, a sufficient amount of memory is 
made available. Therefore, a very complex lexicon can be provided and 
more information made available to the voice synthesizer on the 
text-to-speech device. This partitioning of the text-to-speech system 
between the host system and the text-to-speech device allows the 
computation-intensive processes to be performed on the text-to-speech 
device while providing a large memory to contain the lexicon information. 
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English Abstract 

A system (10) for automatically responding to an inquiry from a user
comprises a dialogue manager (50), which executes a machine-controlled
human-machine dialogue to determine a plurality of pre-determined query
items. The dialogue manager (50) retrieves a plurality of information
items from a storage (52) in dependence on the query items. The system
further comprises a presentation manager (90) which determines an
intention of the user, reflecting a preferred way of presenting the
information items. The presentation manager (90) selects a presentation
scenario from a predetermined set of presentation scenarios (96) in
dependence on the determined intention. At least one natural language
phrase is generated to present the obtained information items according
to the selected presentation scenario. A speech generator (60) verbally
presents the generated phrase(s) to the user.

French Abstract 

L'invention concerne un systeme (10) concu pour repondre automatiquement
a l'interrogation emanant d'un utilisateur. Ledit systeme (10) comprend
un gestionnaire de dialogue (50) qui execute un dialogue homme-machine
commande par une machine, pour determiner plusieurs articles
d'interrogation predetermines. Le gestionnaire de dialogue (50) extrait
plusieurs articles d'information d'une memoire (52), en fonction des
articles d'interrogation. Ledit systeme comporte egalement un
gestionnaire de presentation (90) qui determine une intention de
l'utilisateur, refletant une maniere preferee de presentation des
articles d'information. Le gestionnaire de presentation (90) selectionne
un scenario de presentation dans un ensemble predetermine de scenarios de
presentation (96), en fonction de l'intention determinee. Au moins une
phrase en langage naturel est generee de sorte que les informations
obtenues soient presentees en fonction du scenario de presentation
selectionne. Un generateur de parole (60) presente verbalement a
l'utilisateur la ou les phrases generees.



20/3,AB/4 (Item 2 from file: 349) 

DIALOG(R) File 349: PCT Fulltext

(c) 2000 WIPO/MicroPat. All rts. reserv.

00692528 

VOICE BROWSER FOR INTERACTIVE SERVICES AND METHODS THEREOF 
NAVIGATEUR VOCAL POUR SERVICES INTERACTIFS ET PROCEDES ASSOCIES 

Patent Applicant/Assignee: 

MOTOROLA INC, MOTOROLA INC., 1303 East Algonquin Road, Schaumburg, IL 
60196, US 
Inventor (s) : 

LADD David, LADD, David, 4141 Downers Drive, Downers Grove, IL 60615, US 
JOHNSON Gregory, JOHNSON, Gregory, 565 Iroquois Trail, Carol Stream, IL 
60188, US 

Patent and Priority Information (Country, Number, Date) : 

Patent: WO 0005708 A1 20000203 (WO 200005708)

Application: WO 99US16776 19990723 (PCT/WO US9916776) 

Priority Application: US 9894131 19980724; US 9894032 19981002 

Designated States: AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES 
FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU
LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA
UG UZ VN YU ZW GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM 
AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM 
GA GN GW ML MR NE SN TD TG 

Publication Language: English 

Filing Language: English 

Fulltext Word Count: 21304 

English Abstract 

A voice browser to process a markup language document. A voice browser 
includes a network fetcher unit to retrieve information from a 
destination of an information source. A parser unit is communicatively 
coupled to the network fetcher to parse the retrieved information based 
on predetermined syntax. The parser unit generates a tree structure 
representing the hierarchy of the retrieved information. An interpreter 
unit and a state machine are also used. The method includes the steps of 
retrieving and parsing a markup language document to determine at least 
one user input, determining whether the user input corresponds to a 
predetermined grammar, and using the predetermined grammar when the user 
input corresponds to the predetermined grammar. The method of determining 
a grammar is based upon phonetic rules and pronunciation. The grammar is 
sent to a speech recognition engine and compared to a user input. 
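
The parse-to-tree and grammar-check steps in this abstract can be sketched with Python's standard XML parser. The element names, the grammar attribute, and its "yes|no" syntax are hypothetical choices for the example; the abstract does not define the markup syntax or how the grammar is encoded.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup document standing in for the retrieved information.
DOCUMENT = """
<dialog>
  <step name="confirm">
    <prompt>Do you want the weather report?</prompt>
    <input grammar="yes|no"/>
  </step>
</dialog>
"""


def outline(element, depth=0):
    """Return (depth, tag) pairs describing the parsed hierarchy."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


def matches_grammar(utterance, grammar):
    """True if the recognized utterance is one of the grammar's alternatives."""
    return utterance.strip().lower() in grammar.split("|")


root = ET.fromstring(DOCUMENT)
tree = outline(root)
grammar = root.find("./step/input").get("grammar")
```

Here tree lists the document hierarchy top-down, and matches_grammar plays the role of comparing a user input against the predetermined grammar before it is accepted.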

French Abstract 

La presente invention concerne un navigateur vocal capable de traiter un
document HTML. Un tel navigateur comporte un module de recherche reseau
permettant de retrouver une information en provenance d'une destination
d'une source d'information. Un module d'analyse est couple communiquant
au module de recherche reseau de facon a analyser l'information retrouvee
en fonction d'une syntaxe definie. Le module d'analyse genere une
arborescence representant la hierarchie de l'information retrouvee. Un
module interprete est couple communiquant au module de recherche reseau
de facon a traiter le document HTML. Un automate fini est couple
communiquant au module interprete et au module d'analyse. Un procede
selon l'invention consiste a retrouver un document HTML, a analyser le
document HTML a la recherche d'au moins une entree utilisateur, a
determiner si cette entree utilisateur correspond a une grammaire
definie, et a utiliser cette grammaire definie si l'entree utilisateur
consideree correspond a la grammaire definie. Le procede consiste enfin a
reconnaitre la grammaire d'apres des regles phonetiques definies et la
prononciation lorsque l'entree utilisateur consideree ne se trouve pas
dans la grammaire definie, a envoyer la grammaire a un moteur de
reconnaissance vocale et a comparer a une entree utilisateur la
grammaire.
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Publication Language: English 

Filing Language: English 

Fulltext Word Count: 20939 

English Abstract 

The present invention (Fig. 6) relates to a voice browser to provide 
interactive services. A markup language document in accordance with the 
present invention includes a dialog element (2) including a plurality of 
markup language elements (3-26). Each of the plurality of markup language
elements is identifiable by at least one markup tag. A step element (11)
is contained within the dialog element. The step element includes a
prompt element (4) and an input element (9). The prompt element (4)
includes an announcement to be read to the user. The input element 
includes at least one input that corresponds to a user input. A method in 
accordance with the present invention includes the steps of creating a 
markup language document having a plurality of elements (3-26), selecting 
a prompt element (2), and defining a voice communication (14) in the 
prompt element to be read to the user. The method further includes the 
steps of selecting an input element (2) and defining an input variable to 
store data inputted by the user. 
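
The method steps in this abstract (create the document, define the announcement in a prompt element, define an input variable) can be sketched as follows. The tag names mirror the element names used in the abstract, but the attribute name "name" and the two helper functions are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def build_document(announcement, variable):
    """Create a dialog element holding one step with a prompt and an input."""
    dialog = ET.Element("dialog")
    step = ET.SubElement(dialog, "step")
    prompt = ET.SubElement(step, "prompt")
    prompt.text = announcement                         # text to be read to the user
    ET.SubElement(step, "input", {"name": variable})   # variable that stores the reply
    return dialog


def run_step(dialog, user_reply):
    """Return the announcement to speak and the variables set by the reply."""
    step = dialog.find("step")
    spoken = step.find("prompt").text
    variables = {step.find("input").get("name"): user_reply}
    return spoken, variables


doc = build_document("Please say your account number.", "account")
markup = ET.tostring(doc, encoding="unicode")
```

Serializing doc gives the markup a voice browser would fetch, while run_step shows the prompt being read out and the user's input being stored under the declared variable.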

French Abstract 

La presente invention (Fig. 6) concerne un navigateur vocal capable de
fournir des services interactifs. Selon la presente invention, on dispose
d'un document en langage de balisage qui inclut un element de dialogue
(2) integrant une pluralite d'elements de langage de balisage (3-26).
Chacun de ces elements de langage de balisage est identifie par une
etiquette de balisage. L'element de dialogue renferme un element d'etape
(11). Cet element d'etape comprend un element d'invite (4) et un element
d'entree (9). L'element d'invite (4) comporte une annonce a faire lire a
l'utilisateur. L'element d'entree comporte au moins une entree qui
correspond a une entree utilisateur. L'invention concerne egalement un
procede consistant a creer un document en langage de balisage comportant
une pluralite d'elements (3-26), a selectionner un element d'invite (2),
et a definir dans l'element d'invite a faire lire a l'utilisateur une
communication vocale (14). Le procede consiste enfin a selectionner un
element d'entree (2) et a definir une entree variable permettant de
ranger les donnees introduites par l'utilisateur.
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AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM 
GA GN GW ML MR NE SN TD TG 

Publication Language: English 

Filing Language: English 

Fulltext Word Count: 21375 

English Abstract 

The present invention relates to systems and methods to provide a user 
with information from an information source. A system in accordance with 
the present invention includes a communication node including a switch 
having at least one incoming line. An audio processing unit is 
communicatively coupled to the switch to receive incoming audio 
communications from the user and to provide outgoing audio communications 
to the user. A voice browser is communicatively coupled to the audio 
processing unit. The voice browser retrieves information from the 
information source and provides an output to the audio processing unit. 
The audio processing unit provides an outgoing audio communication to the 
user in response to the output. A method in accordance with the present 
invention includes the steps of receiving an audio input from a user 
associated with a destination of an electronic network, connecting to the 
destination based upon the audio input, and retrieving information 
associated with the destination. The method further includes the steps of 
processing the information to generate a voice communication, and 
providing the voice communication to the user. 

French Abstract 

La presente invention concerne des systemes et procedes permettant de
fournir a un utilisateur de l'information a partir d'une source
d'information. Un tel systeme comporte un noeud de communications
comportant un commutateur pourvu d'au moins une ligne en entree. Un
module de traitement audio est couple communiquant au commutateur de
facon a recevoir les communications audio entrantes en provenance de
l'utilisateur, et de facon a fournir a l'utilisateur des communications
en sortie. Un navigateur vocal est couple communiquant au module de
traitement audio. Ce navigateur vocal va retrouver l'information dans la
source d'information, puis il realise une sortie a destination du module
de traitement audio. En reaction a la sortie, ce module de traitement
audio fournit a l'utilisateur une communication audio en sortie. Le
procede de l'invention consiste a recevoir une entree audio en provenance
d'un utilisateur associe a une destination d'un reseau electronique, a se
connecter sur la destination en fonction de l'entree audio, et a
retrouver l'information associee a la destination. Le procede consiste
enfin a traiter l'information de facon a generer une communication
vocale, puis a fournir a l'utilisateur cette communication vocale.
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Inventor ( s ) : 

MOSHFEGHI Mehran, MOSHFEGHI, Mehran, Prof. Holstlaan 6, NL-5656 AA
Eindhoven, NL

GLICKSMAN Robert A, GLICKSMAN, Robert, A., Prof. Holstlaan 6, NL-5656 AA
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Patent and Priority Information (Country, Number, Date) : 
Patent: WO 9942932 A2 19990826 
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Filing Language: English 
Fulltext Word Count: 3070 
English Abstract 

A Computer-based Patient Record (CPR) system includes user-equipment
devices which are configured for speech synthesis in response to speech
markup language text and which are connected via a network to a middle
tier of a server system. The CPR system further includes a message
delivery facility for delivery of textual messages to any of pager,
electronic mail, or voice mail (after text-to-speech synthesis)
message delivery vehicles. The server system accesses a user-specific
data store containing speech synthesis profiles which include prosodic
information of the voices and speech of users, and message delivery
profiles which specify which of the aforementioned message delivery
vehicles are to be used and in what order. The stored speech synthesis
information associated with an originator of a message and the stored
message delivery information associated with the recipient of the message are
provided by the server to user equipment or a reminder generator to
produce speech markup files containing information needed to synthesize
the vocal and speech characteristics of the originator accompanied by
delivery instructions reflecting the message delivery preferences of the
recipient.
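
The pairing of an originator's speech synthesis profile with a recipient's ordered delivery profile can be sketched as below. The field names, the prosody parameters, the vehicle identifiers, and the attempt callback are all hypothetical; the abstract only states that profiles hold prosodic information and an ordered choice of delivery vehicles.

```python
from dataclasses import dataclass


@dataclass
class SpeechProfile:
    pitch_hz: float   # assumed prosodic parameter
    rate_wpm: int     # assumed prosodic parameter


@dataclass
class DeliveryProfile:
    vehicles: list    # ordered preference, e.g. ["pager", "email", "voice_mail"]


def build_message(text, originator_speech, recipient_delivery):
    """Combine the originator's voice profile with the recipient's preferences."""
    return {
        "text": text,
        "prosody": {"pitch_hz": originator_speech.pitch_hz,
                    "rate_wpm": originator_speech.rate_wpm},
        "delivery_order": list(recipient_delivery.vehicles),
    }


def deliver(message, attempt):
    """Try each vehicle in order; attempt(vehicle, message) returns True on success."""
    for vehicle in message["delivery_order"]:
        if attempt(vehicle, message):
            return vehicle
    return None


msg = build_message("Lab results are ready.",
                    SpeechProfile(pitch_hz=110.0, rate_wpm=160),
                    DeliveryProfile(vehicles=["pager", "email", "voice_mail"]))
```

The message carries both what is needed to synthesize the originator's voice and the recipient's delivery order, so either a user device or a reminder generator could act on it.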
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SPEECH OPERATED AUTOMATIC INQUIRY SYSTEM 

SYSTEME DE RENSEIGNEMENTS AUTOMATIQUE A COMMANDE VOCALE
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KONINKLI JKE PHILIPS ELECTRONICS NV, KONINKLI JKE PHILIPS ELECTRONICS N.V. 
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Publication Language: English 
Filing Language: English 
Fulltext Word Count: 7191 

Fulltext Availability: 
Detailed Description 

Detailed Description 

the question/confirmation statements from the dialogue
manager 50 in various forms, such as a (potentially prosodically
enriched) textual form or as speech fragments. The speech
generation subsystem 60 may be based on speech synthesis techniques
capable of converting text-to-speech. The speech generation subsystem
60 may itself prosodically enrich the speech fragments or text in
order to generate more naturally sounding speech. The enriched material
is then transformed to speech output. Speech generation has been
disclosed in...

...a sentence with certain system-specific voice characteristics (e.g. the 
voice of an actor) and one isolated utterance (his own) in between. 
Preferably, the prosody of the input utterance is changed to correspond 
to the prosody of the entire statement. Via the interface 70 the speech 
output is provided to the user at the speech output interconnection 80. 
Typically, a loudspeaker. . . 
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PROCEDE ET SYSTEME D 1 INTERROGATION AUTOMATI QUE 
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Publication Language: English 

Filing Language: English 

Fulltext Word Count: 7592 

Fulltext Availability: 
Detailed Description 
Detailed Description 

... 60 may receive the question/confirmation statements from the dialogue
manager 50 in various forms, such as a (potentially prosodically
enriched) textual form or as speech fragments. The speech
generation subsystem 60 may be based on speech synthesis techniques
capable of converting text-to-speech. The speech generation subsystem
60 may itself prosodically enrich the speech fragments or text in
order to generate more naturally sounding speech. The enriched material
is then transformed to speech output. Speech
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SYSTEM AND METHOD FOR DELIVERING ELECTRONIC MESSAGING TO MOBILE PHONES 
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Fulltext Availability: 
Detailed Description 

Detailed Description 

the e-mail notification server; incoming client request logging by
the Voice Mail Notification Server; incoming client request logging
by text-to-speech server; and incoming call logging by the
IVR applications. Two logging tables are provided. Each outgoing message
will be logged in the "Logging...

...ID; an account ID, a message type identifier (e-mail, voice mail,
warning, response), and a time stamp. Incoming text-to-speech messages
from the text-to-speech server will be logged in the "Text-to-speech
server Logging" table of the database, which is the second
logging table. Each record includes: a user ID, duration of the
message (in minutes), a message count, and a time stamp.

The e-mail notification server preferably records a detailed log file. 
This file. . . 
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English Abstract 

...system. The CPR system further includes a message delivery facility 
for delivery of textual messages to any of pager, electronic mail, or 
voice mail (after text-to-speech synthesis) message delivery
vehicles. The server system accesses a user specific data store 
containing speech synthesis profiles which include prosodic information 
of the voices and speech of users, and message delivery profiles which 
specify which of the aforementioned message delivery vehicles are to be 
used. . . 
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Fulltext Availability: 
Detailed Description 

Detailed Description 

... Unit (ARU) Capabilities 146 1. User Interface 146 L. Message 

Management 149 1. Multiple Media Message Notification 149 2. Multiple 
Media Message Manipulation 150 3. Text to Speech 150 4. Email 
Forwarding to a Fax Machine 151 5. Pager Notification of Messages 
Received 151 6. Delivery Confirmation of Voicemail 151 7. Message 
Prioritization. .. can be made for directory service as well as for 
registration (a one time fee plus a monthly fee), call setup, but 
probably not for duration . Duration is already charged for the 
Internet dial in user and is somewhat bundled for the LAN-attached user 

Usage charges for Internet service may be coming soon (as discussed 
above) . 

Duration charges are possible for the incoming and outgoing PSTN 
segments . 

Incoming PSTN calls may be charged as the long distance segment by using
a special...
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SPEC A (English) EPAB96 8640 
Total word count - document A 10376 
Total word count - document B 0 
Total word count - documents A + B 10376 

...SPECIFICATION message of FIG. 4 in audio form to the card owner at 
telephone set 145, for example. Specifically, IVRS 125 is a processor 
that executes text-to-speech synthesis programmed instructions
designed to use ASCII input, such as one of the messages shown in FIG. 
4, to generate a "read aloud" audio rendition of that ASCII... 
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Detailed Description 

... and one of the recognised town names is tested. If the number is
manageable, for example if it is three or fewer, the control means
instructs (25) the speech synthesiser to play an announcement from the
message data store 3, followed by recitation of the name, address and
telephone number of each entry, generated by the speech synthesiser 1
using text-to-speech synthesis, and the process is complete (26). If,
on the other hand, the number of entries is excessive then further steps
27, to be discussed...
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...CLAIMS of speech parameter values are indicative of change to a text to 
speech synthesizer in corresponding acoustical characteristics of 
said base synthesized voice; 

opening a text to speech synthesizer with a command string 
containing command-line arguments, wherein said command-line 
arguments include current present ones of first class speech 
parameter values; 

forming a text string, wherein...
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, .♦ SPECIFICATION of a speech synthesis apparatus used in preferred 
embodiments of the present invention. 

In FIG. 25, reference numeral 101 represents a keyboard (KB) for 
inputting text from which speech will be synthesized, a control
command or the like. The operator can input a desired position on a 
display picture surface of a display unit 108 using a pointing device 
102. . . 
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.SPECIFICATION in script buffer 31 is provided to processor 32 which 
separates the text narration from the multimedia commands and provides
the text narration to the text-to-speech converter 26. The presence
of multimedia commands is detected by an action token detector 34 which
identifies the beginning of each scripting command. The action...

...intended. In general, the multimedia scripting command can include
commands to incorporate further text files 36 and feed the text from
those text files to text-to-speech converter 26, commands to obtain
MIDI files 37 and feed the MIDI music in those files to MIDI synthesizer
28, commands to obtain bit map image files 39 and to feed the still
video information in those bit map image files to video monitor 17,
commands...
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.SPECIFICATION response pair. Here, the operation may not necessarily be 
the speech controlled one. Also, the response is described as a command, 
where "synth()" is a command to output the synthesized speech with
its argument as the text of the speech output, and "play()" is a
command to output the data specified by its argument as the waveform
data. Here, $<cat> in the argument... 
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Verfahren und Einrichtung zur Darstellung von Segmenteinheiten zur 

Text-Sprache-Umsetzung.
Methode et dispositif pour la representation d'unites segmentaires pour la
conversion texte-parole.

PATENT ASSIGNEE: 

International Business Machines Corporation, (200120), Old Orchard Road,
Armonk, N.Y. 10504, (US), (applicant designated states: DE; FR; GB)
IBM SEMEA S.r.l., (1179640), Via Fara, 35, P.O. Box 137, I-20124 Milan,
(IT), (applicant designated states: IT)
INVENTOR:
Giustiniani, Massimo, Via Carlo Fadda 19, I-00173 Rome, (IT)
Pierucci, Piero, Via P. Mengoli 14, I-00146 Rome, (IT)
LEGAL REPRESENTATIVE:
Lettieri, Fabrizio (59683), IBM SEMEA S.p.A., Direzione Brevetti, MI SEG
534, P.O. Box 137, I-20090 Segrate (Milano), (IT)
PATENT (CC, No, Kind, Date): EP 515709 A1 921202 (Basic)
APPLICATION (CC, No, Date): EP 91108575 910527;
PRIORITY (CC, No, Date): EP 91108575 910527
DESIGNATED STATES: DE; FR; GB; IT
INTERNATIONAL PATENT CLASS: G10L-005/04;
ABSTRACT WORD COUNT: 132

LANGUAGE (Publication, Procedural, Application): English; English; English



FULLTEXT AVAILABILITY:
Available Text  Language   Update  Word Count
CLAIMS A        (English)  EPABF1  753
SPEC A          (English)  EPABF1  5505
Total word count - document A      6258
Total word count - document B      0
Total word count - documents A + B 6258



...CLAIMS the determination of the speech feature vectors is obtained 
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taking the feature vectors of said AEHMM corresponding to the most 
probable labels. 

6. A concatenative text-to-speech synthesizer system including a
Text Input means (100) for entering text to be synthesized, a Text
Processor (101) for converting the graphemic input into a...

...feature vectors for said Synthesis Filter (106) and a
Back-Transformation Processor (SU14) which transforms the domain of
spectral coefficient representation in order to be directly used by
said Synthesis Filter (106).

7. The text-to-speech synthesizer system of claim 6 in which said
Segmental Unit Linker (105) includes a Stretch by Copy Processor
(SU21) producing a sequence of labels with...

...to phonotactical constraints of the language and a Coefficients
Back-Transformation Processor (SU25) which transforms the domain of
spectral coefficient representation in order to be directly used by
said Synthesis Filter (106).

8. The text-to-speech synthesizer system of claim 7 wherein said
optimality criterion used in said AEHMM Interpolation Processor
(SU24) consists in...
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Detailed Description 

Detailed Description 

and affect the safety of the driver as well as those around the 
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driver . 

The art of speech synthesis has seen many improvements, and today text 
to speech converters are commercially available. Cellular telephones 
which supply audio feedback are known . . . feedback of numbers in response 
to keys pressed on a keypad. Another example is U.S. Patent No. 
5,095,503, entitled "CELLULAR TELEPHONE CONTROLLER WITH SYNTHESIZED
VOICE FEEDBACK FOR DIRECTORY NUMBER CONFIRMATION AND CALL STATUS". 
Speech recognition is another area that now offers commercial solutions 
to those wishing to employ voice commands in a system. . . 
PROCEDE ET APPAREIL D'ANALYSE DE REPONSES ELECTROENCEPHALOGRAPHIQUES A
FACETTES MULTIPLES
Main International Patent Class: A61B-005/04;
Publication Language : English 
Fulltext Word Count: 15100 

Fulltext Availability: 
Claims 

Claim 

... PC could send a command and deassert the task bit lines before the
robot (which runs very slowly) has had a chance to buffer the command.

The speech synthesizer system 190 from DEC includes a board for the PC
and a speaker to place on the robot (DECtalk PC text-to-speech system
from Digital Equipment Corporation). Connections were made with a
standard
...Inventor: MCALLISTER A I



Abstract (Basic) : 

... A service control point comprises a database of call processing
records, for controlling several services along with the central
office. Signaling messages are communicated between the service 
control point and the central office. An INDEPENDENT CLAIM is also 
included for personalized telecommunication services processing method 



...The service uses speech based identification, thereby eliminating the 
burden on the subscriber to dial in long strings of identifying digits 
Automated subscriber telephone number providing method - prompting user
to speak name and location of sought party, and digitising responses 
before feeding them to speech recognition devices, whose outputs are 
used to search database for corresponding number 
...Inventor: CURRY J E ...

...MCALLISTER A I

...Abstract (Basic): a telephone user to an automated directory assistance
station, upon a user dialling a predetermined number on a telephone.
The user responds to a stored message, by speaking a name of a
location of a sought subscriber. A second stored message prompts the 
user to speak the last name of the sought subscriber. The responses 
from the user are encoded into first and second digital signals which 
are compatible with two speech recognition devices. The signals are 
transmitted to the speech recognition devices which use word 
recognition and phoneme recognition, respectively...

...The output signals from the speech recognition devices are decoded and
a probability level signal is associated with each decoded signal. The
probability level signals are combined according to a predetermined...

...signals, associated with the highest probability level are selected. The 
second selected signal is used to obtain a corresponding directory 
number from a database. A message is transmitted to the user, 
articulating the directory number... 

...USE/ADVANTAGE - E.g. for automatic processing of directory assistance
calls in telecommunication network. Uses available speech recognition 
equipment in unique manner, to attain improved level of effectiveness. 
Minimises necessity to rely on operator intervention. Maximises 
successful provision of required assistance... 
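
The selection step described in this abstract can be sketched roughly as
follows. This is an illustration only: the abstract elides the actual
combination rule, so the simple summation used here is an assumption, and
the function names and the toy directory are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# One recognizer hypothesis: (decoded text, probability level).
Hypothesis = Tuple[str, float]


def combine_hypotheses(word_hyps: List[Hypothesis],
                       phoneme_hyps: List[Hypothesis]) -> Dict[str, float]:
    """Merge the outputs of the word and phoneme recognizers.

    Assumption: scores for the same decoded text are simply added. The
    abstract only says they are combined "according to a predetermined..."
    rule that is not given.
    """
    scores: Dict[str, float] = {}
    for text, probability in list(word_hyps) + list(phoneme_hyps):
        scores[text] = scores.get(text, 0.0) + probability
    return scores


def select_best(scores: Dict[str, float]) -> str:
    """Pick the decoded text with the highest combined probability level."""
    return max(scores, key=scores.get)


def lookup_number(directory: Dict[Tuple[str, str], str],
                  location: str, last_name: str) -> Optional[str]:
    """Return the directory number for (location, last name), if listed."""
    return directory.get((location, last_name))


# Hypothetical recognizer outputs for the two prompted responses.
location = select_best(combine_hypotheses(
    [("Arlington", 0.6), ("Burlington", 0.3)],
    [("Arlington", 0.5), ("Darlington", 0.4)]))
last_name = select_best(combine_hypotheses(
    [("Smyth", 0.5), ("Smith", 0.4)],
    [("Smith", 0.5)]))
toy_directory = {("Arlington", "Smith"): "555-0100"}
number = lookup_number(toy_directory, location, last_name)
```

Running two recognizers of different kinds means a candidate that both
agree on can outrank a candidate that only one of them scores highly, which
is the effect the summation above is meant to show.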

...Title Terms: SPEECH ; 
Providing subscriber telephone numbers to telephone users - using speech
recognition to decode area and name prompted from user and articulates
corresp. code and number retrieved from database
...Inventor: CURRY J E ...

...MCALLISTER A I ...

...MCALLISTER A

...Abstract (Basic): The method involves enabling automated station to
respond to a set dialled number to prompt a caller by a recorded 
message to give a desired location. The response is digitised and 
simultaneously input to word and phoneme recognition devices, which 
each output a translation signal and...

...ADVANTAGE - Efficient. Acceptable and pleasing to user. Uses available
speech recognition devices. Need for operator intervention minimised

...Title Terms: SPEECH ;
MEADOR Frank E
HAYDEN James B
HANLE John P
Fulltext Availability: 

Detailed Description 

Claims 

English Abstract 

An automated directory assistance system with voice processing unit for 
use in a telecommunication network includes multiple speech recognition 
devices comprising a word recognition device, a phoneme recognition 
device, and an alphabet recognition device. A caller is prompted to speak 
the city or . . . 

Detailed Description 

connect the telephone to an audio digital interface system and causing 
a first prestored prompt to be provided to the user from system memory. 

This message instructs the user to spell letter-by-letter the last name
of the subscriber whose telephone number is desired. Each time a letter 
of the. . . 

...the number of such matches in reading through the entire main memory to 
be stored in a match counter. 

A selected one of three recorded messages is then transmitted to the 
user with the selected message corresponding to one of four different 
situations.
METHOD AND SYSTEM FOR HOME INCARCERATION

PROCEDE ET SYSTEME PERMETTANT L'INCARCERATION A DOMICILE
Detailed Description

Claims 

English Abstract 

...telephone network including a telephone (58) on the premises of the 
location of confinement and a control center (48). Voice verification,
using voice analysis of speech transmitted in a telephone call from the 
site (58) to the center (48) is performed during periodic testing. A 
voice template vocabulary is established for. . . 

Detailed Description 

premises by communicating with the individual via a telephone network, 
identifying the location by utilizing caller line identification and 
identifying the individual by voice identification speech processing. 

Background Art

The concept of home incarceration has evolved as an alternative to
detention in government jail and prison facilities. In cases of
relatively light ... individual to be verified. Such identification
attempts likely would not be successful if the system serves a large
number of detainees or if the speech of the called party is slurred by
the influence of drug or alcohol abuse. Enforcement personnel frequently
must be dispatched to the confinement sites to... 377 contemplates the
use of a voiceprint as a means for remote identification of a prisoner.
Audio spectral analysis is performed and is applied to speech
transmitted over a telephone line to determine a match with a
probationer's voiceprint.
15/3, K/3 (Item 2 from file: 275)

DIALOG (R) File 275: Gale Group Computer DB(TM)
(c) 2000 The Gale Group. All rts. reserv.

01235466 SUPPLIER NUMBER: 07142699 

An efficient multiplexing technique for packet-switched voice-data
networks.
Choi, J.K.; Un, C.K.

Proceedings of the IEEE, v76, n9, p1254(3)
Sept, 1988 

ISSN: 0018-9219 LANGUAGE: ENGLISH RECORD TYPE: ABSTRACT 

ABSTRACT: A new computationally-efficient packet-switched voice-data 
network multiplexing technique uses a synchronous frame structure with the 
same duration as the voice packet generation interval to enable the 
synchronous transmission of voice packets without loss or clipping. 
Packets are sequenced through the use of inter-arrival -- delay -- times and
sequence numbers. The intervals and active voice periods... 

...derived from the received packet streams through use of the frame 
format. The frame structure is applicable to a variety of enhanced services 
with varying voice transmission rates and packet sizes. 
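
A frame-based multiplexer of the general kind this abstract describes
might look like the sketch below. It is not the authors' algorithm: the
slot model, the priority of voice over data and all names are assumptions
made for illustration. The one idea taken from the abstract is that the
frame lasts as long as the voice packet generation interval, so each active
voice source contributes at most one packet per frame.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple

# (kind, voice source id or None, payload)
Slot = Tuple[str, Optional[int], object]


def build_frame(voice_queues: Dict[int, Deque],
                data_queue: Deque,
                slots_per_frame: int) -> List[Slot]:
    """Fill one synchronous frame.

    Assumption: voice sources are served first, one packet each, and any
    slots left over carry data packets. Admission control that keeps the
    number of voice sources within slots_per_frame is assumed to exist
    elsewhere.
    """
    frame: List[Slot] = []
    for source_id in sorted(voice_queues):
        queue = voice_queues[source_id]
        if queue and len(frame) < slots_per_frame:
            frame.append(("voice", source_id, queue.popleft()))
    while data_queue and len(frame) < slots_per_frame:
        frame.append(("data", None, data_queue.popleft()))
    return frame


# Hypothetical traffic: payloads are sequence numbers for voice, labels for data.
voice = {1: deque([0]), 2: deque([0, 1])}
data = deque(["d0", "d1", "d2"])
first_frame = build_frame(voice, data, slots_per_frame=4)
second_frame = build_frame(voice, data, slots_per_frame=4)
```

In the second frame source 1 is silent, so its slot is free for data; that
is how a fixed frame can still adapt to talk spurts and silences.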



15/3, K/4 (Item 1 from file: 674) 

DIALOG (R) File 674: Computer News Fulltext 

(c) 2000 IDG Communications. All rts. reserv. 

082505 

Gearhead - Speechifyin' software

Byline: Mark Gibbs 

Journal: Network World Page Number: 46 

Publication Date: March 27, 2000 
Word Count: 494 Line Count: 44

Text : 

... 28, page 48), we discussed a cool utility named Talking Stocks from
4Developers (www.4developers.com). Gearhead has been intrigued by software
that talks since speech-generating chips became available about 15
years ago. We remember well those distorted robot voices that sounded like
a tourist from Eastern Europe with a bad...

...WillowTalk has a range of predefined voices that imitate male and female
tonality quite well. You can also define your own voices in terms of pitch,
speed and volume, and the product allows for custom dictionaries so you
can define the pronunciation of special words. The reading scripts feature
is odd, to say the least: You fill in a grid with the voice you want in one
column and the text for that voice in the other and the voice reads the
script. One of these days Gearhead plans to create a completely synthesized
reading of MacBeth (http://sailor...

...speech to a file that lets you include synthesized voices in other
applications. Another fun speech utility is SayIt from AnalogX
(www.analogx.com/contents/download/audio/sayit.htm). AnalogX has a lot
of public domain software for Windows on its Web site, including
something called SayIt. SayIt is simple and was designed along the lines of
Speak 'n Spell. It has a text entry window where you can enter up to 500
text characters and four sliders that let you change pitch, speed,
modulation and cascade. (AnalogX omits explaining what these last two
attributes actually do - get 'em wrong and the voice can sound awful.) You
can simply have the text read to you or you can save the synthesized
15/3 ,K/1 (Item 1 from file: 647)

DIALOG (R) File 647: CMP Computer Fulltext 
(c) 2000 CMP. All rts. reserv. 

01124894 CMP ACCESSION NUMBER: CWK19970505S0042 

Street Technologies Paves Way For Sound (Intranet Watch) 

John Evan Frook 

COMMUNICATIONSWEEK, 1997, n 661, PG8 
PUBLICATION DATE: 970505 

JOURNAL CODE: CWK LANGUAGE: English 

RECORD TYPE: Fulltext 

SECTION HEADING: Top of the News 

WORD COUNT: 173 

TEXT: 

Corporate buyers have saved millions purchasing computers devoid of 
sound cards, but now those unhearing machines are useless for delivery 
of audio training materials over corporate intranets. That's the issue 
Street Technologies Inc., White Plains, N.Y., is tackling with its 
StreetSound parallel port sound card. . . 

...sound. Street Technologies CEO Stephen Gott said the card was 
developed to help support efforts of Street's sister company, Learning 
Tree International, which develops text, audio and video
computer-based training programs at www.learningtree.com. After pitching 
50 different CIOs on developing multimedia training materials for 
intranets, Gott said he found that 90 percent of their installed seats 
had no sound. Street... 

...95 sound card monthly and this week is launching a multimedia help 
desk for StreetSound, accessed via www.streetinc.com, to demonstrate the 
power of Web-training tools.



15/3, K/2 (Item 1 from file: 275) 

DIALOG (R) File 275: Gale Group Computer DB(TM) 
(c) 2000 The Gale Group. All rts. reserv. 

02074428 SUPPLIER NUMBER: 19520438 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

On the Web, voices carry; high quality and low bandwidth. (Voxware) 
(Company Business and Marketing) 

PC Magazine, v16, nSPEISS, p12(1)
Summer, 1997 

ISSN: 0888-8507 LANGUAGE: English RECORD TYPE: Fulltext 

WORD COUNT: 266 LINE COUNT: 00024 

with the goal of making Internet telephony a competitive 
alternative to today's telephone network. 

Voxware's core technology creates a mathematical model of human
speech that can then be efficiently delivered over the Internet. Once
the software models the data, the voice can do all kinds of gymnastics,
altering pitch, speed, resonance, and other characteristics. In one
application of this technology, an entertainment company could model an
actor's voice for a cartoon character, ensuring that the
character--complete with computer-generated voice--could outlive the
actor. Spooky.

The ability to transform a human voice and then send it over the Web 
raises a whole new area of... 
voice to disk. Some of the voices Gearhead got out of SayIt were great -
clear and easily understood. These speaking programs are great fun and can
be used to generate speech for other application programs or Web
sites - Ein, zwei, drei, vier. Synthesis to gh@gibbs.com.



15/3 ,K/5 (Item 1 from file: 148) 

DIALOG (R) File 148: Gale Group Trade & Industry DB 
(c) 2000 The Gale Group. All rts. reserv.

11557320 SUPPLIER NUMBER: 58079713 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Fastcomm Introduces Enhanced Feature Set for the Voice Over Packet
Marketplace.

Business Wire, 1331 
Dec 8, 1999 

LANGUAGE: English RECORD TYPE: Fulltext

WORD COUNT: 925 LINE COUNT: 00081 

... enhance our reputation with current and potential customers." 

Integrated Call Detail Recording 

The Integrated Call Detail Recording (iCDR) enhancement captures call
activity details from each site on the network. After a call is
terminated, the MetroLAN(TM) or GlobalStack(TM) packet voice router
will generate a message that contains the call details. The captured data
includes calling and called parties, call duration (to the second), call
routing information (whether routed over FR or IP), and disconnection
reason.

The iCDR message is routed as an IP packet through... 
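
The fields listed for the iCDR message suggest a record along the
following lines. This is a sketch under stated assumptions: the press
release does not give the wire format, so the JSON encoding, the field
names and the sample values are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CallDetailRecord:
    """Call details named in the release; field names are illustrative."""
    calling_party: str
    called_party: str
    duration_seconds: int   # the release says duration is "to the second"
    routed_over: str        # "FR" (frame relay) or "IP"
    disconnect_reason: str

    def __post_init__(self) -> None:
        if self.routed_over not in ("FR", "IP"):
            raise ValueError("routed_over must be 'FR' or 'IP'")
        if self.duration_seconds < 0:
            raise ValueError("duration_seconds must not be negative")


def encode_cdr(record: CallDetailRecord) -> str:
    """Serialize a record for transport (JSON is an assumption)."""
    return json.dumps(asdict(record), sort_keys=True)


# Hypothetical record generated after a call is terminated.
sample = CallDetailRecord("2025550101", "2025550102", 75, "IP", "normal clearing")
message = encode_cdr(sample)
```

Validating the routing field when the record is built keeps malformed
records from reaching whatever collector receives the messages.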

18/3, K/1 (Item 1 from file: 275)

DIALOG (R) File 275: Gale Group Computer DB(TM)
(c) 2000 The Gale Group. All rts. reserv.

01549059 SUPPLIER NUMBER: 12972691 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Planning for 1995: the future is now. (technology strategies for the
future) (overview of four articles on strategic technology
planning) (Special Report: 1995)

Battelle, John; Eliot, Lance B.; Rothfelder, Jeffrey; Steinberg, Don
Corporate Computing, v1, n6, p166(15)
Dec, 1992 

ISSN: 1065-8610 LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT 

WORD COUNT: 7255 LINE COUNT: 00564 

... Ashton-Tate will bring to light the industry's best new acronym:
BLObs. You can find BLObs (binary large objects; compound objects that can
contain text, graphics, sound, video, and other information) in
InterBase, the jewel in Ashton-Tate's software portfolio. InterBase is a
Unix-based relational database server engine that supports both SQL and
its own data-manipulation language. Its proprietary language extends the
standard relational model significantly: it's geared for high...

...bursts of incoming information such as a sales transaction. Yet it also
supports the kind of database access typical to a desktop PC user:
long-duration browsing, editing, and printing of records. Borland calls this
odd combination on-line complex processing, in which real-time transactions
can be posted while users...



18/3 ,K/2 (Item 2 from file: 275) 

DIALOG (R) File 275: Gale Group Computer DB(TM) 
(c) 2000 The Gale Group. All rts. reserv. 

01548151 SUPPLIER NUMBER: 13229548 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Microsoft goes for hard sell. (Microsoft's marketing strategy for its new 
Access database management system) (PC User News) (Brief Article) 

PC User, n198, p17(1)
Nov 18, 1992 

DOCUMENT TYPE: Brief Article ISSN: 0263-5720 LANGUAGE: ENGLISH 

RECORD TYPE: FULLTEXT 

WORD COUNT: 300 LINE COUNT: 00023 

Microsoft will be staging road-shows in London, Birmingham, 
Manchester, Edinburgh, Bath and Dublin. Some dealers will be holding their 
own events.

Microsoft is pitching Access at users with little or no programming 
skills, enabling them to build databases with text, numbers, sound and
full-motion video. Its GQBE graphical query tool can also analyse data 
between dBase, Paradox, Btrieve and Microsoft SQL Server formats. 

Mike Farrow, a consultant developer with a beta site for Access, 
Channel Business Systems, believes there will be a large market for the 
product. . . 



18/3, K/3 (Item 1 from file: 636) 

DIALOG (R) File 636: Gale Group Newsletter DB(TM) 
(c) 2000 The Gale Group. All rts. reserv. 

03467068 Supplier Number: 47147989 (USE FORMAT 7 FOR FULLTEXT) 

-IBM: IBM and Eloquent Technology Inc. Announce speech technology alliance 

M2 Presswire, pN/A 
Feb 24, 1997

Language: English Record Type: Fulltext 
Document Type: Newswire; Trade
Word Count: 651 

(USE FORMAT 7 FOR FULLTEXT) 
TEXT: 

...IBM and Eloquent Technology Inc. Announce speech technology alliance 
(C) 1994-97 M2 COMMUNICATIONS LTD RDATE:210297 * IBM to license rights to 
ETI's advanced text-to-speech system IBM and Eloquent Technology Inc.
(ETI) have recently announced in the US that IBM has acquired certain
exclusive rights to Eloquent's powerful text-to-speech technology
system. As part of the agreement, IBM and Eloquent will work closely to
integrate text-to-speech functions into future IBM products and
applications that are part of the IBM VoiceType family. ETI will continue 
to license and support its toolkit product... 

...to enhance the consumer's experience by extending speech technology to
applications, products and appliances of all shapes and sizes."
ETI-Eloquence is a flexible text-to-speech system that produces
high-quality speech with natural sounding intonation. The ETI-Eloquence
system provides nine built-in voices, including those of adults and 
children, both male and female. Developers and end-users can easily create 
additional voices by controlling such parameters as gender, 
breathiness, roughness, pitch fluctuation and speaking rate. The 
linguistic models and specialised development tools underlying Eloquence 
make it highly extensible and customizable. In addition, the technology 
provides a robust development platform that both IBM and ETI plan to 
exploit as the market for high quality text-to-speech solutions
continues to develop. "We are very pleased that IBM recognised the 
potential of our technology," said Sue Hertz, Ph.D., president of Eloquent 
Technology. . . 

...a broad variety of speech-enabled products, and will provide users with 
access to many interactive applications that take advantage of a combined 
speech-to-text/text-to-speech product and toolkit." Notes to editor
Eloquent Technology, Inc. ETI, located in Ithaca, New York, was founded in
1983 by Sue Hertz, Ph.D., explicitly for the purpose of developing and
marketing text-to-speech software. ETI has been the recipient of
numerous government grants and contracts for text-to-speech research
and development. The first version of Eloquence was released in early 1995. 
ETI-Eloquence is suitable for a wide range of applications, which include 
reading and speaking aids, CD-ROM edutainment products, telephony and 
integrated voice response applications, Internet talking pages, 
information and warning systems, and many others. For more information on 
ETI or its products, call 00 1 607-266-7025. Internet users can access
the ETI home page on the World Wide Web at http://www.eloq.com A UK 
English version of ETI-Eloquence will be available later this year. IBM 
Speech Systems IBM, with a family. . . 

...3.0 for Windows 95. * Product and service names are the trademarks or 
registered trademarks of International Business Machines Corporation, or 
their respective owners. For Internet users, IBM offers complete 
information about the company, its products, services and technology on the 
World Wide Web. The IBM VoiceType home page is at
www.software.ibm.com/is/voicetype. CONTACT: James Lloyd, Charles Barker
Tel: +44 (0)171 830 8493 Fax... 



18/3 ,K/4 (Item 2 from file: 636) 

DIALOG (R) File 636: Gale Group Newsletter DB(TM) 



Dialog compft 9139 a c 25723 



AB 2 



Report for SPE Fan Tsang 08/948328 September 27, 2000 11:26 



(c) 2000 The Gale Group. All rts. reserv.

02683352 Supplier Number: 45442536 (USE FORMAT 7 FOR FULLTEXT) 
EDGE OF CHAOS: Current Perspectives on Interactive Advertising Paul Kagan 
Conference on Interactive Advertising 

Multimedia & Videodisc Monitor, v13, n4, pN/A
April, 1995 

Language: English Record Type: Fulltext 
Document Type: Newsletter; Trade 
Word Count: 2861 

(USE FORMAT 7 FOR FULLTEXT) 
TEXT: 

...get the lines in." He called interactivity a marketing discipline -- as
opposed to an advertising or promotional discipline -- and offered the
example of a Godiva Internet site that informs about the "lusciousness
of chocolate" and also includes an online candy store. Hauptschein
commented that interactivity must be thought of as a content... 

...3400, 30 South Wacker Drive, Chicago IL 60606, 312/750-5000). * Marty 
Levin (vice president, Microsoft Advanced Technology Division; creative 
director of the pending Microsoft Network) described the current online
services market as the "first step up the bandwidth scale," with
communications being the current killer application. Regarding Microsoft's
business model, he indicated that on Microsoft Network, the information
providers (as opposed to the service operator) will be making the lion's
share of the revenue. Levin said, "Today we have connectivity ... t see much
of it in infomercials." He said that when talent performs, much more
merchandise is sold than when the person "gets too involved pitching the
product." Paxton reminded the audience that a telethon (which is a long
program for charity) raises the most money when the performers are on
stage. Paxton said, "It's time for advertisers to get back to sponsoring
shows -- not just pitching products." He reminded attendees of the day
when the Texaco Television Theater, hosted by Sid Caesar, "had the Texaco
Star emblem on screen for over forty minutes of programming time." He said,
"There is tremendous room for diversification in infomercials," which today
pitch five things: "thinness, muscles, hair, psychics, and finding a 
mate." Supporting Tom Grieb's statement about how interactivity assists in 
local markets, Paxton told the... 

...to "old programmers to create new content." He also admitted that 
interactive tools currently "stink." He predicted that people will soon get 
burned on the Internet , as an average site can handle only three to 
twelve simultaneous callers. He compared the cruise ship industry to the 
online services industry. In surveys of potential cruise ship...

...it again. Leonsis said that Apple Computer's 2Market shopping service
on AOL has a $78 average purchase, which is two times Home Shopping
Network's average order, and one-and-a-half times the average paper
catalog order. According to Leonsis, 50,000 hours of online shopping time
was...

...customer service; and a social dynamic of some kind. For advertising, 
offer 1) robust interactive information, with a real point of difference; 
2) multimedia support -- text, graphics, sound, and video; 3) a full
range of communications options -- e-mail, bulletin boards, and chat; and
4) great customer service (445 Hamilton Avenue, White Plains NY 10601,
914/448-2496). * Dan Burns (former director of Delphi/Internet) said that
online services are good for providing easy access to "considered" 
purchases, gifts, and transaction-related products like travel and finance. 
He said, "Interactive. . . 
...a critical mass, and marketers have to find and work with good
developers who can create the sponsored environments." He opined that 
establishing an unsupported site on the Internet would be "like 
putting a billboard on your lawn, just because there are 100 million cars 
in the US" (1030 Massachusetts Avenue, Cambridge MA 02138...

...if consumers leave the mass media, advertising as it currently exists
will disappear. He noted, "Since Great Britain doesn't have commercial 
online services, the Internet is the center of activity." He reported 
that 40 percent of online users (including business services) are women. 
Online Explosion II * Christina Ford (vice president... 

...killer interactive applications only when: 1) a McDonald's download 
doesn't take three hours; 2) the $30,000 you paid to the HotWired
Internet address actually gets some users to register; 3) America Online
can be specific about what you get for $300,000; 4) women dominate; 5) e... 

...of interactive advertising conference sessions; and 7) a Time Warner FSN 
ad doesn't cost a million dollars to reach five homes. Providing advice 
about site creation, she said that "virtual information spaces" require a 
"diction" that prompts repeat usage, so that consumers want to return to 
the application. She warned. . . 



18/3 ,K/5 (Item 3 from file: 636) 

DIALOG (R) File 636: Gale Group Newsletter DB(TM) 
(c) 2000 The Gale Group. All rts. reserv. 

02656409 Supplier Number: 45381467 (USE FORMAT 7 FOR FULLTEXT) 
THIS WEEK'S LEAD STORY: DYNAMIC ROUTE GUIDANCE DEALT BLOW BY TRIALS OF
PHILIPS SOCRATES UNIT 

Intelligent Highway, v5, n23, pN/A 
March 6, 1995 

Language: English Record Type: Fulltext 
Document Type: Newsletter; Trade 
Word Count: 1302 

... every minute, Biding says. 

This sizeable discrepancy required the establishment of a new message
"filtering process," Biding adds. Travel times for highway links within a
network allow the unit to calculate routes offering the shortest journey
time. The guidance instructions are then provided to the driver by
direction arrows on the in-vehicle unit's colour display and through
speech synthesis instructions.
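
Choosing the route with the shortest journey time from per-link travel
times is a classic shortest-path problem. The article does not say which
algorithm the in-vehicle unit uses; the sketch below uses Dijkstra's
algorithm as a plausible stand-in, and the link table and node names are
made up for illustration.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# links[node] = [(neighbouring node, travel time in minutes), ...]
Links = Dict[str, List[Tuple[str, float]]]


def fastest_route(links: Links, start: str,
                  goal: str) -> Tuple[Optional[List[str]], float]:
    """Return (route, total minutes) using Dijkstra's algorithm.

    Returns (None, inf) when the goal cannot be reached.
    """
    best = {start: 0.0}          # lowest known travel time to each node
    previous: Dict[str, str] = {}
    heap = [(0.0, start)]
    while heap:
        elapsed, node = heapq.heappop(heap)
        if node == goal:
            break
        if elapsed > best.get(node, float("inf")):
            continue             # stale queue entry, a faster way was found
        for neighbour, minutes in links.get(node, []):
            candidate = elapsed + minutes
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(heap, (candidate, neighbour))
    if goal not in best:
        return None, float("inf")
    route = [goal]
    while route[-1] != start:
        route.append(previous[route[-1]])
    route.reverse()
    return route, best[goal]


# Hypothetical link travel times as they might arrive from traffic messages.
toy_links: Links = {
    "A": [("B", 5), ("C", 2)],
    "B": [("D", 3)],
    "C": [("B", 1), ("D", 7)],
}
route, minutes = fastest_route(toy_links, "A", "D")
```

Because the link times change as new traffic messages arrive, a unit of
this kind would presumably rerun the search whenever its table is updated.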

This data reception problem is confirmed by officials at Volvo, the
vehicle manufacturer which supplied vehicles for the trial. The "processing 
power of the (in. . . 



18/3, K/6 (Item 4 from file: 636) 

DIALOG (R) File 636: Gale Group Newsletter DB(TM) 
(c) 2000 The Gale Group. All rts. reserv. 

02289348 Supplier Number: 44426085 (USE FORMAT 7 FOR FULLTEXT) 
ND COMTEC BRINGS IN LILLE UNIVERSITY'S PHRASEA MULTIMEDIA ARCHIVING,
RETRIEVAL SOFTWARE FOR MAC 

Computergram International, n2349, pN/A 
Feb 8, 1994 

Language: English Record Type: Fulltext 
Document Type: Newswire; Trade 
Word Count: 648

(USE FORMAT 7 FOR FULLTEXT)
TEXT:

...S, has been appointed distributor and sole UK maintenance provider for
Phrasea - a full text indexing multimedia archiving and retrieval software
program that manages picture, text, video and sound files on the same
database on the Apple Computer Inc Macintosh. Phrasea automatically indexes
and stores information to a free text retrieval database so structured...

...by the University of Lille in Paris, the software was launched in France 
in October 1993. In the UK, ND Comtec says it will be pitching the 
product at media support industries, newspaper and publishing companies and 
at local government. Phrasea is available in two versions: Phrasea Agency 
features only the... 

...II is suitable for networks and comprises all the above features. The 
stand-alone database is GBP1,500 and requires 1.5Mb of RAM, the server 
for networking costs GBP1,725 requiring 3Mb of RAM and an additional 128Kb 
for each concurrent user; and the communication server for remote access 
is GBP1,100 needing 3Mb of RAM. Phrasea currently runs only on the Apple 
Macintosh but a Windows version will be available... 



18/3 ,K/7 (Item 1 from file: 148) 

DIALOG (R) File 148: Gale Group Trade & Industry DB 
(c)2000 The Gale Group. All rts. reserv.

09756784 SUPPLIER NUMBER: 19761691 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Newsroom systems suit up for RTNDA; Windows NT, Web are buzzwords. 
(Radio-Television News Directors Association show) (Special Report:
Newsroom Systems) 

Dickson, Glen 

Broadcasting & Cable, v127, n38, p89(4)
Sep 15, 1997 

ISSN: 1068-6827 LANGUAGE: English RECORD TYPE: Fulltext 

WORD COUNT: 3321 LINE COUNT: 00259 

... a stand-alone entity at RTNDA. Open systems, not total systems,
will be the message in New Orleans for AvidNews, which is designed to
handle text composition, audio and video browsing, and Web
publishing.

"We understand that lots of customers want a newsroom computer as 
well as DNG video gear, and some may not want the DNG stuff. . . 



18/3, K/8 (Item 2 from file: 148) 

DIALOG (R) File 148: Gale Group Trade & Industry DB 
(c)2000 The Gale Group. All rts. reserv. 

07218120 SUPPLIER NUMBER: 14984824 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

THE WORLD'S FIRST INTERNET CYBERSTATION TO BROADCAST FROM NETWORLD+INTEROP
94

PR Newswire, p0407SF007 
April 7, 1994 

LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT 

WORD COUNT: 648 LINE COUNT: 00057

forum for addressing the networking interoperability challenges and 
solutions found in the real world of enterprise computing. The CyberStation 
will highlight "demonstrated convergence" of voice, text, sound and
image technologies on the Internet. The programming includes world news,
technical forums and music to be cybercast for the duration of the
NetWorld+Interop exhibition. Other highlights from the cybercast include 
live popular mainstream broadcast programs from National Public Radio (NPR) 
and news reporting by. . . 
22/3 ,K/1 (Item 1 from file: 275)

DIALOG (R) File 275: Gale Group Computer DB(TM) 
(c) 2000 The Gale Group. All rts. reserv.

02048879 SUPPLIER NUMBER: 19244117 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Voice processing: Bell Labs launches Web site for text-to-speech synthesis.
(up to nine different languages) (Company Business and Marketing)

EDGE, on & about AT&T, v12, p23(1)
March 10, 1997 

LANGUAGE: English RECORD TYPE: Fulltext 

WORD COUNT: 1150 LINE COUNT: 00098 

. . . Romanian. 

SPEAKING ON THE WEB Visitors to the Bell Labs Text-to-Speech Synthesis 
Web site at http://www.bell-labs.com/project/tts/, can sample speech in 
up to nine different languages, as well as visit a demonstration area that 
allows users to synthesize English, German, and Mandarin Chinese 
sentences using male, female, and child intonations with effects such as 
raspiness. The site offers the experience of high-quality interactive, 
on-the-fly modifications of voice samples.

The Bell Labs TTS system even handles German noun compounds, which are 
notorious for being long and complex, and which cannot be prestored in a... 

22/3, K/2 (Item 1 from file: 636) 

DIALOG (R) File 636: Gale Group Newsletter DB(TM) 
(c) 2000 The Gale Group. All rts. reserv. 

03513997 Supplier Number: 47259439 (USE FORMAT 7 FOR FULLTEXT) 
NEW ON THE WEB THIS MONTH. . . LUCENT TECHNOLOGIES 

Internet Business News, pN/A 
April 1, 1997 

Language: English Record Type: Fulltext 
Document Type: Magazine/Journal; Trade
Word Count: 64 

(USE FORMAT 7 FOR FULLTEXT) 
TEXT:

LUCENT TECHNOLOGIES has introduced its Bell Labs Text-to-Speech web
site located at http://www.bell-labs.com/project/tts designed to allow
visitors to produce natural speech in several languages directly from
written text. As well as this, users will be able to visit the
demonstration section which enables them to synthesize English sentences
using either male, female or child intonations.



22/3, K/3 (Item 2 from file: 636) 

DIALOG (R) File 636: Gale Group Newsletter DB(TM)
(c) 2000 The Gale Group. All rts. reserv. 

03486605 Supplier Number: 47189608 (USE FORMAT 7 FOR FULLTEXT) 
NEW ON THE WEB: LUCENT TECHNOLOGIES 

Telecomworldwire, pN/A 
March 7, 1997 

Language: English Record Type: Fulltext 
Document Type: Newsletter; Trade 
Word Count: 64 

(USE FORMAT 7 FOR FULLTEXT) 
TEXT: 
LUCENT TECHNOLOGIES has introduced its Bell Labs Text-to-Speech web
site located at http://www.bell-labs.com/project/tts designed to allow
visitors to produce natural speech in several languages directly from
written text. As well as this, users will be able to visit the
demonstration section which enables them to synthesize English sentences
using either male, female or child intonations.



22/3, K/4 (Item 3 from file: 636) 

DIALOG (R) File 636: Gale Group Newsletter DB(TM) 
(c) 2000 The Gale Group. All rts. reserv.

03485715 Supplier Number: 47187553 (USE FORMAT 7 FOR FULLTEXT) 
LUCENT TECHNOLOGIES: Bell Labs launches Web site for Text to Speech 
synthesis 

M2 Presswire, pN/A 
March 6, 1997 

Language: English Record Type: Fulltext 
Document Type: Newswire; Trade 
Word Count: 826 

Romanian . 

SPEAKING ON THE WEB Visitors to the Bell Labs Text-to-Speech Synthesis 
Web site at http://www.bell-labs.com/project/tts/ can sample speech in 
up to nine different languages, as well as visit a demonstration area that 
allows users to synthesize English sentences using male, female, and 
child intonations with effects such as raspiness. The site offers the 
experience of high-quality interactive, on-the-fly modifications of voice 
samples.

The Bell Labs TTS system even handles German noun compounds, which are 
notorious for being long and complex, and which cannot be prestored in a... 



22/3, K/5 (Item 1 from file: 148) 

DIALOG(R) File 148: Gale Group Trade & Industry DB
(c)2000 The Gale Group. All rts. reserv. 

09333788 SUPPLIER NUMBER: 19183995 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Bell Labs Launches Web Site For Text-To-Speech Synthesis.

Business Wire, p3051056 
March 5, 1997 

LANGUAGE: English RECORD TYPE: Fulltext 

WORD COUNT: 1270 LINE COUNT: 00112

. . . Romanian . 

SPEAKING ON THE WEB 

Visitors to the Bell Labs Text-to-Speech Synthesis Web site at 
http://www.bell-labs.com/project/tts/, can sample speech in up to nine 
different languages, as well as visit a demonstration area that allows 
users to synthesize English, German, and Mandarin Chinese sentences 
using male, female, and child intonations with effects such as raspiness. 
The site offers the experience of high-quality interactive, on-the-fly 
modifications of voice samples.

The Bell Labs TTS system even handles German noun compounds, which 
are notorious for being long and complex, and which cannot be prestored in 
a . . . 
File 2:INSPEC 1969-2000/Sep W4

(c) 2000 Institution of Electrical Engineers 
File 6:NTIS 1964-2000/Oct W3 

Comp&distr 2000 NTIS, Intl Cpyrght All Right 
File 8:Ei Compendex(R) 1970-2000/Aug W4 

(c) 2000 Engineering Info. Inc. 
File 14:Mechanical Engineering Abs 1973-2000/Sep

(c) 2000 Cambridge Sci Abs 
File 65:Inside Conferences 1993-2000/Sep W4 

(c) 2000 BLDSC all rts. reserv. 
File 77:Conference Papers Index 1973-2000/Jul 

(c) 2000 Cambridge Sci Abs 
File 94:JICST-EPlus 1985-2000/May W3 

(c)2000 Japan Science and Tech Corp(JST) 
File 99:Wilson Appl. Sci & Tech Abs 1983-2000/Aug 

(c) 2000 The HW Wilson Co. 
File 108:Aerospace Database 1962-2000/Sep 

(c) 2000 AIAA 
File 144:Pascal 1973-2000/Sep W4 

(c) 2000 INIST/CNRS 
File 233:Internet & Personal Comp. Abs. 1981-2000/Sep 

(c) 2000 Info. Today Inc. 
File 238: Abs. in New Tech & Eng. 1981-2000/Sep 

(c) 2000 Reed-Elsevier (UK) Ltd. 
File 34:SciSearch(R) Cited Ref Sci 1990-2000/Sep W3 

(c) 2000 Inst for Sci Info 
File 434:SciSearch(R) Cited Ref Sci 1974-1989/Dec 

(c) 1998 Inst for Sci Info 

Set     Items     Description
S1      3551      ((TEXT? ?(2W)(SPEECH OR VOICE)))(5N)SYSTEM? ? OR TTS
S2      2145      TEXT? ?(2N)(TRANSFORM? OR CONVERT? OR CONVERSION? OR SYNTHES?
                  OR (CHANGE? OR TURN?)(2N)INTO)(5N)(SOUND OR AUDIO? OR VOICE?
                  OR SPEECH)
S3      17908     (SPEECH OR VOICE)(2N)(SYNTHES? OR GENERAT?)
S4      20608     S1 OR S2 OR S3
S5      61        S4 AND ((WEB OR NETWORK OR W3 OR INTERNET OR INTRANET)(5N)
                  (SERVER? OR SITE?) OR WEB()PAGE?)
S6      85        AUDIO(2W)(WAVEFORM? OR WAVE()FORM?)
S7      404       DIGIT?()SEQUENC?
S8      13532     (PROSOD? OR ACCENTUAT? OR INTONATION?)
S9      9333      CONCATENAT?
S10     3241      (SPEECH? OR SOUND? OR VOICE)(2N)(FRAGMENT? OR SAMPL?)
S11     7340      SYLLABLE?
S12     1282      ((NATURAL OR HIGH()QUALITY)(3N)(SOUND? OR SPEECH?))(10N)
                  (SYNTHES? OR GENERAT?)
S13     510545    (PITCH? OR DURATION OR APTITUDE OR (ATTACK OR DECAY)(2N)
                  ENVELOP? OR (SYNTHES?()INSTRUCT?))
S14     145197    (WAVEFORM? OR WAVE()FORM? ?)
S15     828433    USER? OR CUSTOMER? OR CLIENT? OR SUBSCRIB?
S16     3         S5 AND S6:S14
S17     3495      TEXT? ?(2W)(SPEECH OR VOICE)
S18     2574141   (WEB OR NETWORK OR W3 OR INTERNET OR INTRANET OR SERVER? OR
                  SITE? OR WEB()PAGE?)
S19     690       (S1 OR S17) AND S18
S20     2434882   (SYNTHESIZ? OR GENERAT?)
S21     138       S19 AND S20
S22     6         S21 AND S5:S6
S23     35161     (ROUT? OR DELIVER? OR SEND OR SENT OR TRANSMIT? OR TRANSMIS?
                  OR PASS? OR REMIT? OR DOWNLOAD)(5N)(AUDIO OR SPEECH OR
                  VOICE OR SOUND)
S24     49        S19 AND S23
S25     29        S24 AND S15
S26     23        RD (unique items)
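
The set history above combines earlier result sets with Boolean operators, set ranges, and a final duplicate-removal step. The following is a minimal Python sketch of that logic, not the Dialog implementation: the record IDs and titles are invented, the helper names `union_range` and `remove_duplicates` are hypothetical, and it assumes that a range such as S6:S14 is treated as the OR of the sets it spans and that RD keeps the first copy of each title.

```python
def union_range(sets, first, last):
    """OR together the sets S<first> .. S<last> (assumed meaning of 'S6:S14')."""
    return set().union(*(sets[f"S{i}"] for i in range(first, last + 1)))


def remove_duplicates(records):
    """Keep the first record for each title, ignoring case and spacing.

    This approximates the 'RD (unique items)' step; the real matching
    rules of the search service are not described in the printout.
    """
    seen = set()
    unique = []
    for file_no, title in records:
        key = " ".join(title.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append((file_no, title))
    return unique


# Toy record IDs, invented purely for illustration.
sets = {
    "S1": {101, 102, 103},
    "S2": {103, 104},
    "S3": {105},
    "S17": {102, 106},
    "S18": {102, 104, 106, 107},
}

# S4 = S1 OR S2 OR S3
sets["S4"] = sets["S1"] | sets["S2"] | sets["S3"]

# S19 = (S1 OR S17) AND S18
sets["S19"] = (sets["S1"] | sets["S17"]) & sets["S18"]

# The same paper retrieved from two different files collapses to one item.
hits = [
    (2, "SITES: Slovene Interactive Text-to-Speech Evaluation Site"),
    (8, "SITES: Slovene interactive text-to-speech evaluation site"),
    (2, "CORBA-based multimedia audio chat"),
]
unique_hits = remove_duplicates(hits)
```

Modeling each set as a Python `set` of record IDs makes OR a union and AND an intersection, which is why the item count of a combined set can never exceed the sum (for OR) or the smaller operand (for AND) of its inputs.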
22/3, K/1 (Item 1 from file: 2)

DIALOG (R) File 2: INSPEC 

(c) 2000 Institution of Electrical Engineers. All rts. reserv.

6628938 INSPEC Abstract Number: C2000-08-5260S-007
Title: Slovene Interactive Text-to-Speech Evaluation Site - SITES
Author(s): Gros, J.; Mihelic, F.; Pavesic, N.

Author Affiliation: Fac. of Electr. Eng., Ljubljana Univ., Slovenia 

Conference Title: Text, Speech and Dialogue. Second International 
Workshop, TSD'99. Proceedings (Lecture Notes in Artificial Intelligence
Vol.1692) p. 223-8 

Editor (s): Matousek, V.; Mautner, P.; Ocelikova, J.; Sojka, P. 

Publisher: Springer-Verlag, Berlin, Germany 

Publication Date: 1999 Country of Publication: Germany xi+396 pp. 

ISBN: 3 540 66494 7 Material Identity Number: XX-1999-03149 

Conference Title: Text, Speech and Dialogue. Second International 
Workshop, TSD'99. Proceedings

Conference Date: 13-17 Sept. 1999 Conference Location: Plzen, Czech 
Republic 

Language: English

Copyright 2000, IEE 

Title: Slovene Interactive Text-to-Speech Evaluation Site - SITES
Abstract: The Slovene Interactive Text-to-Speech Evaluation Site
(SITES) was built according to standards for interactive speech
synthesizer comparison sites as set by COCOSDA (International Committee
for the Co-ordination and Standardization of Speech Databases and
Assessment Techniques for Speech Input/Output) and the LDC (Linguistic Data
Consortium). SITES aims to give interested listeners a thorough and
honest impression of the current text-to-speech (TTS) system and
provides valuable feedback about strong and weak points of the system. The
SITES Web site enables us to evaluate the S5 Slovene TTS system
either interactively or off-line by sending the synthesized speech file
to a given E-mail address. We implemented various standard text selection
methods and set up rules for construction as semantically unpredictable
sentences for the Slovene language. The evaluation Web site has the
capability to accept arbitrary input text, and returns a speech file. A CGI
script first reads the user's form input. When the user submits the form,
the script receives the form data as a set of name-value pairs, which is
parsed. In the CGI script, the TTS system is called with the parameters
specified by the user. The TTS system generates a temporal audio file
which is sent back to the user.

...Descriptors: speech synthesis

Identifiers: Slovene Interactive Text-to-Speech Evaluation Site;
SITES; ...

...Web site; ...

...S5 Slovene TTS system...

...synthesized speech file
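
The request flow described in the abstract above (form input parsed into name-value pairs, TTS system called with the user's parameters, audio file returned) can be sketched as follows. This is an illustrative Python sketch only, not the SITES code: the field names `text` and `voice`, the functions `parse_form`, `synthesize` and `handle_request`, and the `audio/x-wav` content type are all assumptions, and `synthesize` is a stand-in that writes silence instead of calling a real synthesizer.

```python
import io
import wave
from urllib.parse import parse_qs


def parse_form(body: str) -> dict:
    """Decode a URL-encoded form body into name-value pairs (first value wins)."""
    return {name: values[0] for name, values in parse_qs(body).items()}


def synthesize(text: str, voice: str = "male", rate: int = 16000) -> bytes:
    """Placeholder for the TTS engine (hypothetical).

    Returns a mono 16-bit WAV containing 0.1 s of silence per word.
    The 'voice' argument is accepted but unused here; a real system
    would pass it on to the synthesizer.
    """
    n_words = max(1, len(text.split()))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * (rate // 10) * n_words)
    return buf.getvalue()


def handle_request(body: str) -> bytes:
    """Build a CGI-style response: header lines, a blank line, then the audio."""
    form = parse_form(body)
    audio = synthesize(form.get("text", ""), form.get("voice", "male"))
    headers = (
        "Content-Type: audio/x-wav\r\n"
        f"Content-Length: {len(audio)}\r\n"
        "\r\n"
    )
    return headers.encode("ascii") + audio


response = handle_request("text=Dober+dan&voice=female")
```

Returning the audio in the response body corresponds to the interactive mode; the off-line mode mentioned in the abstract would instead attach the same bytes to an e-mail message.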



22/3, K/2 (Item 2 from file: 2) 

DIALOG (R) File 2: INSPEC 

(c) 2000 Institution of Electrical Engineers. All rts. reserv. 

6482299 INSPEC Abstract Number: B2000-03-6130E-008, C2000-03-5260S-007
Title: SITES: Slovene Interactive Text-to-Speech Evaluation Site
Author(s): Gros, J.; Mihelic, F.; Pavesic, N.

Author Affiliation: Fac. of Electr. Eng., Ljubljana Univ., Slovenia 
Conference Title: ISIE '99. Proceedings of the IEEE International 
Symposium on Industrial Electronics (Cat. No.99TH8465) Part vol.1 p.
213-16 vol.1 

Publisher: IEEE, Piscataway, NJ, USA 

Publication Date: 1999 Country of Publication: USA 3 vol. xxiii+1568 
pp. 

ISBN: 0 7803 5662 4 Material Identity Number: XX-1999-00564 

U.S. Copyright Clearance Center Code: 0 7803 5662 4/99/$10.00 

Conference Title: Proceedings of ISIE '99. IEEE International Symposium
on Industrial Electronics 

Conference Sponsor: IEEE Ind. Electron. Soc. ; Slovenia Minstr. Sci. & 
Technol.; Soc. Instrum. & Control Eng. (Japan); Univ. Maribor; Univ. 
Ljubljana; IEEE Region 8, Slovenia Sect 

Conference Date: 12-16 July 1999 Conference Location: Bled, Slovenia 

Language: English 

Copyright 2000, IEE 
Title: SITES: Slovene Interactive Text-to-Speech Evaluation Site

Abstract: The Slovene Interactive Text-to-Speech Evaluation site
(SITES) was built according to standards for interactive speech
synthesiser comparison sites as set by COCOSDA (International Committee
for the Co-ordination and Standardization of Speech Databases and
Assessment Techniques for Speech Input/Output) and the LDC (Linguistic Data
Consortium). SITES aims to give the interested listeners a thorough and
honest impression of the current text-to-speech (TTS) system and
provides valuable feedback about strong and weak points of the system. The
SITES Web site enables to evaluate the S5 Slovene TTS system either
interactively or off-line by sending the synthesized speech file to a
given e-mail address. We implemented various standard text selection
methods and set up rules for construction as semantically unpredictable
sentences for the Slovene language. The evaluation Web site has the
capability to accept arbitrary input text, and returns a speech file. A CGI
script first reads the user's form input. When the user submits the form,
the script receives the form data as a set of name-value pairs, which is
parsed. In the CGI script, the TTS system is called with the parameters
specified by the user. The TTS system generates a temporal audio file
which is sent back to the user.

...Descriptors: speech synthesis;

Identifiers: Slovene Interactive Text-to-Speech Evaluation Site;
SITES; ...

...interactive speech synthesiser comparison sites; ...
...text-to-speech system; Web site; ...
...S5 Slovene TTS system...
...synthesized speech file



22/3, K/3 (Item 1 from file: 8)

DIALOG(R) File 8: Ei Compendex(R)

(c) 2000 Engineering Info. Inc. All rts. reserv. 

05476840 E.I. No: EIP00025025492 
Title: SITES: Slovene interactive text-to-speech evaluation site
Author: Gros, Jerneja; Mihelic, France; Pavesic, Nikola 
Corporate Source: Univ of Ljubljana, Ljubljana, Slovenia 

Conference Title: Proceedings of the 1999 IEEE International Symposium on 
Industrial Electronics (ISIE '99)

Conference Location: Bled, Slovenia Conference Date: 19990712-19990716 
E.I. Conference No.: 55896 

Source: IEEE International Symposium on Industrial Electronics v 1 1999. 
p 213-216

Publication Year: 1999 
CODEN: 85PTAR 
Language: English 

Title: SITES: Slovene interactive text-to-speech evaluation site
Abstract: The Slovene Interactive Text-to-Speech Evaluation Site
(SITES) was built according to standards for interactive speech
synthesiser comparison sites as set by COCOSDA (International Committee
for the Co-ordination and Standardization of Speech Databases and
Assessment Techniques for Speech Input/Output) and the LDC (Linguistic Data
Consortium). SITES aims to give the interested listeners a thorough and
honest impression of the current text-to-speech (TTS) system and
provides valuable feedback about strong and weak points of the system. The
SITES web site enables to evaluate the S5 Slovene TTS system either
interactively or off-line by sending the synthesized speech file to a
given e-mail address. We implemented various standard text selection
methods and set up rules for construction as Semantically Unapredictable
Sentences for the Slovene language. The evaluation web site has the
capability to accept arbitrary input text, and returns a speech file. A CGI
script first reads the user's form input. When the user submits the form,
the script receives the form data as a set of name-value pairs, which is
parsed. In the CGI script, the TTS system is called with the parameters
specified by the user. The TTS system generates a temporal audio file
which is sent back to the user. (Author abstract) 15 Refs.

Descriptors: Speech synthesis; Interactive computer systems; Speech
intelligibility; World Wide Web; Electronic mail; User interfaces; Speech
recognition

Identifiers: Slovene interactive text to speech evaluation site;
Semantically unapredictable sentence



22/3, K/4 (Item 1 from file: 144) 

DIALOG (R) File 144: Pascal 

(c) 2000 INIST/CNRS. All rts. reserv.

14317779 PASCAL No.: 99-0525224
Slovene Interactive Text-to-Speech Evaluation site - SITES
TSD'99 : text, speech and dialogue : Plzen, 13-17 September 1999

GROS J; MIHELIC F; PAVESIC N 

MATOUSEK Vaclav, ed; MAUTNER Pavel, ed; OCELIKOVA Jana, ed; SOJKA Petr, ed

University of Ljubljana, Faculty of Electrical Engineering, Trzaska 25,
1000 Ljubljana, Slovenia

Text, speech and dialogue. International workshop, 2 (Plzen CZE) 
1999-09-13 

Journal: Lecture notes in computer science, 1999, 1692 223-228 
Language: English 

Copyright (c) 1999 INIST-CNRS. All rights reserved. 

Slovene Interactive Text-to-Speech Evaluation site - SITES

TSD'99 : text, speech and dialogue : Plzen, 13-17 September 1999

The Slovene Interactive Text-to-Speech Evaluation Site (SITES)
was built according to standards for interactive speech synthesiser
comparison sites as set by COCOSDA (International Committee for the
Co-ordination and Standardization of Speech Databases and Assessment
Techniques for Speech Input/Output) and the LDC (Linguistic Data
Consortium). SITES aims to give the interested listeners a thorough and
honest impression of the current text-to-speech (TTS) system and
provides valuable feedback about strong and weak points of the system. The
SITES web site enables to evaluate the S5 Slovene TTS system either
interactively or off-line by sending the synthesized speech file to a
given e-mail address. We implemented various standard text selection
methods and set up rules for construction as Semantically Unapredictable
Sentences for the Slovene language. The evaluation web site has the
capability to accept arbitrary input text, and returns a speech file. A CGI
script first reads the user's form input. When the user submits the form,
the script receives the form data as a set of name-value pairs, which is
parsed. In the CGI script, the TTS system is called with the parameters
specified by the user. The TTS system generates a temporal audio file
which is sent back to the user.

English Descriptors: Speech synthesis; Interactive system; Slovenian

French Descriptors: Synthese parole; Systeme conversationnel; Slovene;
Web site; Text-to-speech synthesis



22/3, K/5 (Item 1 from file: 233) 

DIALOG(R) File 233: Internet & Personal Comp. Abs.
(c) 2000 Info. Today Inc. All rts. reserv.

00580284 00CX03-002 

CT boards' new IP challenge

Grigonis, Richard 

Computer Telephony, March 1, 2000, v8 n3 p120-144, 16 Page(s)
ISSN: 1072-1711 

...new IP networks encourage the distribution of telephony resources
across the LAN or WAN, making it practical to house media processing (voice
compression, DTMF detection/generation, TTS, ASR) in a different
server from network interface resources. Notes that along with board
components, overall PC systems and CPUs have become more powerful, allowing
many CT resource boards to be linked together. Warns that PC-based voice
resource vendors might be flanked by data product makers who bolt voice
processing into routers and other network components. Adds that telecom
equipment manufacturers are worried. Describes products from several
vendors, along with other ways of accomplishing convergence. Includes nine
photos. (KMD)



22/3, K/6 (Item 2 from file: 233)

DIALOG (R) File 233: Internet & Personal Comp. Abs. 
(c) 2000 Info. Today Inc. All rts. reserv. 

00513254 98IT11-038 

CARL's Kid's Catalog moves to the Web

Information Today, November 1, 1998, v15 n10 p52, 1 Page(s)

ISSN: 8755-6286 

Company Name: CARL 

URL: http://www.carl.org 

Product Name: Kid's Catalog Web 

CARL's Kid's Catalog moves to the Web

Product Name: Kid's Catalog web 

Announces the planned release of Kid's Catalog Web by the CARL
Corporation of Denver, CO (888, 303). Says that the product, now under
development, will offer stronger educational and curricular aid with built

...be more interactive, enabling users to take notes, compile research
bibliographies, use collaborative learning tools, and publish their
projects. Says that full text, online encyclopedias, Web sites, and
reference tools will be integrated into the product, which will also be
compatible with text-to-speech synthesizers. Also indicates that it
will support Unicode characters in MARC records, making it translatable
into any language. (JC)

Descriptors: Children; Reference; Catalog; Online Information; 
Information Services; Speech Synthesis 

Identifiers: Kid's Catalog Web; CARL

26/3, K/1 (Item 1 from file: 2)

DIALOG (R) File 2: INSPEC 

(c) 2000 Institution of Electrical Engineers. All rts. reserv. 

6628938 INSPEC Abstract Number: C2000-08-5260S-007 
Title: Slovene Interactive Text-to-Speech Evaluation Site - SITES
Author(s): Gros, J.; Mihelic, F.; Pavesic, N.

Author Affiliation: Fac. of Electr. Eng., Ljubljana Univ., Slovenia 

Conference Title: Text, Speech and Dialogue. Second International 
Workshop, TSD'99. Proceedings (Lecture Notes in Artificial Intelligence
Vol.1692) p. 223-8 

Editor(s): Matousek, V.; Mautner, P.; Ocelikova, J.; Sojka, P. 

Publisher: Springer-Verlag, Berlin, Germany 

Publication Date: 1999 Country of Publication: Germany xi+396 pp. 

ISBN: 3 540 66494 7 Material Identity Number: XX-1999-03149 

Conference Title: Text, Speech and Dialogue. Second International 
Workshop, TSD'99. Proceedings 

Conference Date: 13-17 Sept. 1999 Conference Location: Plzen, Czech 
Republic 

Language: English 

Copyright 2000, IEE

Title: Slovene Interactive Text-to-Speech Evaluation Site - SITES
Abstract: The Slovene Interactive Text-to-Speech Evaluation Site
(SITES) was built according to standards for interactive speech
synthesizer comparison sites as set by COCOSDA (International Committee
for the Co-ordination and Standardization of Speech Databases and
Assessment Techniques for Speech Input/Output) and the LDC (Linguistic Data
Consortium). SITES aims to give interested listeners a thorough and
honest impression of the current text-to-speech (TTS) system and
provides valuable feedback about strong and weak points of the system. The
SITES Web site enables us to evaluate the S5 Slovene TTS system
either interactively or off-line by sending the synthesized speech file to
a given E-mail address. We implemented various standard text selection
methods and set up rules for construction as semantically unpredictable
sentences for the Slovene language. The evaluation Web site has the
capability to accept arbitrary input text, and returns a speech file. A CGI
script first reads the user's form input. When the user submits the
form, the script receives the form data as a set of name-value pairs, which
is parsed. In the CGI script, the TTS system is called with the
parameters specified by the user. The TTS system generates a temporal
audio file which is sent back to the user.
Identifiers: Slovene Interactive Text-to-Speech Evaluation Site;
SITES; ...

...Web site; ...

...S5 Slovene TTS system



26/3, K/2 (Item 2 from file: 2) 

DIALOG (R) File 2 : INSPEC 

(c) 2000 Institution of Electrical Engineers. All rts. reserv. 

6497043 INSPEC Abstract Number: B2000-03-6210R-021, C2000-03-6130M-010
Title: CORBA-based multimedia audio chat
Author(s): Cimpu, V.F.; Ionescu, D.; Vieru, V.; Cimpu, M.

Author Affiliation: Sch. of Inf. Technol. & Eng., Ottawa Univ., Ont.,
Canada

Conference Title: Engineering Solutions for the Next Millennium. 1999 
IEEE Canadian Conference on Electrical and Computer Engineering (Cat. 
No.99TH8411) Part vol.1 p. 342-5 vol.1 
Editor(s): Meng, M.

Publisher: IEEE, Piscataway, NJ, USA 

Publication Date: 1999 Country of Publication: USA 3 vol.
(xxiii+1758) pp.

ISBN: 0 7803 5579 2 Material Identity Number: XX-1999-02278 

U.S. Copyright Clearance Center Code: 0 7803 5579 2/99/$10.00 

Conference Title: Engineering Solutions for the Next Millennium. 1999 
IEEE Canadian Conference on Electrical and Computer Engineering 

Conference Date: 9-12 May 1999 Conference Location: Edmonton, Alta., 
Canada 

Language: English 

Copyright 2000, IEE 

Abstract: This paper presents a chat application that uses CORBA Event
and Naming services for communication between users and Microsoft
text-to-speech engines to speak the messages. Users can choose the computer
voice, which will represent them during the chat, by selecting the gender,
speed and pitch of the text-to-speech engine. After connecting to a
server, users can create new rooms or browse the existing ones. Before
joining a room, a user can retrieve other participants' pictures or
samples of their real voices. One important feature is the absence of a
dedicated chat server, which has been replaced by the CORBA Event and
Naming services. This allows each host, on which the two CORBA services are
running, to be used as a chat server. An open message-passing based
solution assures synchronization between users, as well as the
transmission of chat messages. A new Audio Chat Communication Protocol
(ACCP) has been designed for this purpose.
...Identifiers: text-to-speech engine...

...chat server;



26/3, K/3 (Item 3 from file: 2) 

DIALOG (R) File 2 : INSPEC 

(c) 2000 Institution of Electrical Engineers. All rts. reserv.

6482299 INSPEC Abstract Number: B2000-03-6130E-008, C2000-03-5260S-007
Title: SITES: Slovene Interactive Text-to-Speech Evaluation Site
Author(s): Gros, J.; Mihelic, F.; Pavesic, N.

Author Affiliation: Fac. of Electr. Eng., Ljubljana Univ., Slovenia 
Conference Title: ISIE '99. Proceedings of the IEEE International
Symposium on Industrial Electronics (Cat. No.99TH8465) Part vol.1 p.
213-16 vol.1

Publisher: IEEE, Piscataway, NJ, USA 

Publication Date: 1999 Country of Publication: USA 3 vol. xxiii+1568 
pp. 

ISBN: 0 7803 5662 4 Material Identity Number: XX-1999-00564 

U.S. Copyright Clearance Center Code: 0 7803 5662 4/99/$10.00 

Conference Title: Proceedings of ISIE '99. IEEE International Symposium 
on Industrial Electronics 

Conference Sponsor: IEEE Ind. Electron. Soc. ; Slovenia Minstr. Sci. & 
Technol.; Soc. Instrum. & Control Eng. (Japan); Univ. Maribor; Univ. 
Ljubljana; IEEE Region 8, Slovenia Sect 

Conference Date: 12-16 July 1999 Conference Location: Bled, Slovenia 

Language: English 

Copyright 2000, IEE 
Title: SITES: Slovene Interactive Text-to-Speech Evaluation Site

Abstract: The Slovene Interactive Text-to-Speech Evaluation Site
(SITES) was built according to standards for interactive speech
synthesiser comparison sites as set by COCOSDA (International Committee
for the Co-ordination and Standardization of Speech Databases and
Assessment Techniques for Speech Input/Output) and the LDC (Linguistic Data
Consortium). SITES aims to give the interested listeners a thorough and
honest impression of the current text-to-speech (TTS) system and
provides valuable feedback about strong and weak points of the system. The
SITES Web site enables to evaluate the S5 Slovene TTS system either
interactively or off-line by sending the synthesized speech file to a given
e-mail address. We implemented various standard text selection methods and
set up rules for construction as semantically unpredictable sentences for
the Slovene language. The evaluation Web site has the capability to
accept arbitrary input text, and returns a speech file. A CGI script first
reads the user's form input. When the user submits the form, the script
receives the form data as a set of name-value pairs, which is parsed. In
the CGI script, the TTS system is called with the parameters specified by
the user. The TTS system generates a temporal audio file which is
sent back to the user.
Identifiers: Slovene Interactive Text-to-Speech Evaluation Site;
SITES; ...

...interactive speech synthesiser comparison sites; ...
...text-to-speech system; Web site; ...
...S5 Slovene TTS system



26/3, K/4 (Item 4 from file: 2) 

DIALOG (R) File 2 : INSPEC 

(c) 2000 Institution of Electrical Engineers. All rts. reserv.

5958775 INSPEC Abstract Number: B9808-6210R-023, C9808-5620W-019 

Title: Multimedia digital community: a Web-based multimedia
collaboration system

Author(s): Bisdikian, C.; Brady, S.; Doganata, Y.N.; Foulger, D.;
Marconcini, F.; Mourad, M.; Operowsky, H.L.; Pacifici, G.; Tantawi, A.N.

Author Affiliation: IBM Thomas J. Watson Res. Center, Yorktown Heights, 
NY, USA 

Conference Title: Fourth IEEE Workshop on High-Performance Communication
Systems (HPCS'97) p. 57-62 

Publisher: HPCS'97 Organization Committee, Chalkidiki, Greece 
Publication Date: 1997 Country of Publication: Greece 244 pp. 
Material Identity Number: XX97-01491 

Conference Title: Proceedings of Fourth Workshop on the Architecture and 
Implementation of High Performance Communications Subsystems - HPCC'97 

Conference Date: 23-25 June 1997 Conference Location: Chalkidiki, 
Greece 

Language: English 

Copyright 1998, IEE

Title: Multimedia digital community: a Web-based multimedia
collaboration system

...Abstract: associates on-line. The problem in achieving practical and
marketable computer-based multimedia collaboration systems, we believe, has
been a lack of standards for non-voice multimedia content delivery and
interaction. However, with the growing usage of the hypertext markup
language (HTML) in preparing and linking information on the World-Wide Web,
a practical base for building standard and broadly available multimedia
collaboration solutions is now possible. Realizing that a standards-based,
Web-enabled conferencing solution could be possible, the idea of a
multimedia digital community (MMDC) was conceived with the objective of
marrying the desire for on-line interaction and collaboration using text,
graphics, and voice communications, with the user-friendliness and
pervasiveness of Web-based multimedia browser interfaces. MMDC is a
client/server collaborative solution that has been guided by the need
to develop a system that is open (standards-oriented), platform independent
with low barriers of use on the client side, and easily migratable onto
different platforms and scalable on the server side.
...Descriptors: client-server systems...

...Internet;

...Identifiers: Web-based multimedia collaboration system...

...World-Wide Web; Web-enabled conferencing...
...client/server architecture
03319255 INSPEC Abstract Number: B89020168, C89015337

Title: Comprehensive radiology imaging network: clinical and operational 
impact 

Author (s): Mun, S.K.; Ingeholm, M.L.; Horii, S.; Albers, B.D. 

Author Affiliation: Dept. of Radiol., Georgetown Univ. Hospital, 
Washington, DC, USA 

Conference Title: Electronic Imaging '88: International Electronic 
Imaging Exposition and Conference. Advance Printing of Paper Summaries 
p. 134-8 vol.1 

Publisher: Inst. Graphic Commun, Waltham, MA, USA 

Publication Date: 1988 Country of Publication: USA 2 vol.
xxxviii+1272 pp.

Conference Sponsor: Diagnostic Imaging Magazine; ESD: Electron. Syst.
Design Magazine; et al 

Conference Date: 3-6 Oct. 1988 Conference Location: Boston, MA, USA 
Language: English 

Title: Comprehensive radiology imaging network: clinical and operational 
impact 

...Abstract: medical radiologists. In order to test the technical and
clinical merit of a functioning IMACS system, Georgetown University has
begun the installation of a comprehensive network based on AT&T's
CommView System. The Georgetown project is focused on system integration,
comprehensive implementation, and diversified users' operation. A
comprehensive network consists of the following groups: input points:
where text and images initially enter the system; user workstations:
where images are reviewed and reports are generated; communications
network: consists of image, text, and voice transmission; and data
storage and database: intermediate data storage and archive devices.

Identifiers: comprehensive radiology imaging network; ...

...comprehensive network; ...
...user workstations...

...communications network; voice transmission;



26/3, K/6 (Item 6 from file: 2) 

DIALOG(R) File 2: INSPEC

(c) 2000 Institution of Electrical Engineers. All rts. reserv. 
Title: Voice and text messaging - a concept to integrate the services of
telephone and data networks 
Author(s): Lee, L.-s.; Oun-young, M.

Author Affiliation: Dept. of Electr. Eng., Nat. Taiwan Univ., Taipei, 
Taiwan 

Conference Title: IEEE International Conference on Communications '88: 
Digital Technology - Spanning the Universe. Conference Record (Cat. 
No.88CH2538-7) p. 408-12 vol.1 

Publisher: IEEE, New York, NY, USA 

Publication Date: 1988 Country of Publication: USA 3 vol. xxx+1783 
pp. 

U.S. Copyright Clearance Center Code: CH2538-7/88/0000-0408$01.00
Conference Sponsor: IEEE 

Conference Date: 12-15 June 1988 Conference Location: Philadelphia, 
PA, USA 

Language: English

...Abstract: a voice and text messaging (VTM) system, which can integrate
the distinct services of the telephone and data networks very quickly. In
Taiwan the telephone network has very wide coverage and a large number of
users, while the data network has very limited number of subscribers,
because they have to possess a terminal. The core of VTM described is a
Chinese text-to-speech system which can transform any Chinese text
processed in the data network into Mandarin voice for transmission
over the telephone network. The telephone network users can key in
their instructions such as choice of information, text processing, forward
and backward skipping by pressing the touch-tone buttons of the telephone
set. The electronic mail and database information services provided by the
data network therefore become a portion of the voice mail and message
services provided by the telephone network. A large number of telephone
network users, even without a terminal, can be served by both networks.
...Identifiers: network integration...

...Chinese text-to-speech system;

03189168 INSPEC Abstract Number: B88052777, C88045977
Title: An experimental multimedia mail system 

Author (s): Postel, J.B.; Finn, G.G.; Katz, A.R.; Reynolds, J.K. 
Author Affiliation: Univ. of Southern California, Marina del Rey, CA, USA 
Journal: ACM Transactions on Office Information Systems vol.6, no.1
p. 63-81 

Publication Date: Jan. 1988 Country of Publication: USA 
CODEN: ATOSDO ISSN: 0734-2047 

U.S. Copyright Clearance Center Code: 0734-2047/88/0100-0063$01.50
Language: English

Abstract: With multimedia computer-based mail, a user may create
messages containing text, image, and voice data and send such
messages to other users within a computer network. The authors describe
the development, implementation, and use of one such system. They present
an overview of the system, the system model, the presentation model, the
multimedia mail program for the user's point of view, and plans for
future work.

...Identifiers: computer network 
(c) 2000 Institution of Electrical Engineers. All rts. reserv.

02681182 INSPEC Abstract Number: D86001696 
Title: Taking an independent line (telecommunication networks) 

Author (s) : Horwitt, E. 

Journal: Business Computer Systems vol.5, no. 2 p. 26-32 
Publication Date: Feb. 1986 Country of Publication: USA 
CODEN: BCOSDI ISSN: 0745-0745 
Language: English 

...Abstract: deals, price breaks and an expanding menu of services for
wide area networks. And the time is right to gear up for Integrated
Services Digital Network (ISDN), the emerging standard that in a year or
so should enable users to send video images, data, text and voice
over the same digital lines. But it is a difficult time, too, requiring
decisions among telecommunications and MIS managers. Not only must they
find the...

...combination of communications paths in a wilderness of vendors and
options, but they must also decide who assumes responsibility for
maintaining, monitoring and managing the network - the corporation or the
telephone company.

...Identifiers: Integrated Services Digital Network;
02607476 INSPEC Abstract Number: B86016826, C86013776
Title: The DARPA experimental multimedia mail system
Author(s): Reynolds, J.K.; Postel, J.B.; Katz, A.R.; Finn, G.G.; DeSchon,
A.L.

Author Affiliation: Inf. Sci. Inst., Univ. of Southern California, Marina 
del Rey, CA, USA 

Journal: Computer vol.18, no. 10 p. 82-9 
Publication Date: Oct. 1985 Country of Publication: USA 
CODEN: CPTRB4 ISSN: 0018-9162 

U.S. Copyright Clearance Center Code: 0018-9162/85/1000-0082$01.00
Language: English 

...Abstract: of the Defense Advanced Research Projects Agency are
described. This ongoing experiment extends computer mail to include bit
map, voice, and other data. With this system, users can create messages
containing text, image, and voice data and send such messages to
other users in the ARPA Internet. Current work focuses on programs and
protocols to reach a wider community of users.

...Identifiers: ARPA Internet;
02176402 INSPEC Abstract Number: B84006510, C84005430
Title: Users networks for future offices 
Author (s): Necas, J. 

Journal: Mechanizace Automatizace Administrativy vol.23, no. 9 p. 
340-1 

Publication Date: 1983 Country of Publication: Czechoslovakia 
CODEN: MAUAAU ISSN: 0322-8452 
Language: Czech

Title: Users networks for future offices 

Abstract: Discusses the development of local area networks (LAN) for
future electronic offices. Requirements imposed on networks which enable
transmission of data, texts, pictures and voice are discussed and the
use of coaxial cables as well as optical fibre cables is considered.
Examples of wide-band user network adopted in the USA and advanced
European Countries are presented and future development trends are
discussed. It is concluded that while in the 1980s the...
...Identifiers: wide-band user network;
02050014 INSPEC Abstract Number: B83031157, C83021193
Title: The structure and operating principles of a 64k-bit/s model
network
Author(s): Peter, E. 

Journal: Fernmelde-Praxis vol.60, no. 3 p. 81-94

Publication Date: 10 Feb. 1983 Country of Publication: West Germany 
CODEN: FEPXAP ISSN: 0015-0118 
Language: German 

Title: The structure and operating principles of a 64k-bit/s model
network

Abstract: The purpose in the development of this model network was to
use internationally standardised and compatible techniques not merely for
the telephone coverage of large areas but also to handle computer to
computer communications, transmission of facsimiles, decentralised printing
and high speed data transmission. It will also be used to gather
information on the various functions of the network and its capabilities.
The subscriber was provided with a basic 64 kbit/s channel for
transmitting data, texts, or speech and a 2.4 kbit/s channel to
control the transmission of data and texts. The author concludes that,
besides the approach to the operation...
Identifiers: digital subscriber loop...

...model network;...

...subscriber;
02018860 INSPEC Abstract Number: B83020316

Title: 64-kbit/s switching of text , data and voice using the EDS 
switching system 

Author(s): Hagen, R. 

Author Affiliation: Siemens AG, Munich, West Germany 

Conference Title: GLOBECOM '82. IEEE Global Telecommunications Conference
p. 549-52 vol.2 

Publisher: IEEE, New York, NY, USA 

Publication Date: 1982 Country of Publication: USA 3 vol. xxi+1383 
pp. 

U.S. Copyright Clearance Center Code: CH1819-2/82-0000-0549$00.75
Conference Sponsor: IEEE

Conference Date: 29 Nov.-2 Dec. 1982 Conference Location: Miami, FL,
USA



Language: English 

Title: 64-kbit/s switching of text , data and voice using the EDS 
switching system 

Abstract: The German Post Office (DBP) intends to make a switched digital
network with n 64-kbit/s (n<or=4) full-duplex circuits available in 1983.
The existing EDS switching nodes in the DBP's Integrated Text and Data
Network will be responsible for switching through the
bit-sequence-independent connections via which the user will be able to
transmit text, data and voice. The signaling for a 64-kbits/s
connection is effected, in line with CCITT Recommendation X.21, in a
medium-speed out-slot channel. The...

...Identifiers: switched digital network;...

...Integrated Text and Data Network;
2068775 NTIS Accession Number: AD-A340 317/7/XAB
Voice Technology Study Report 

(Study rept) 

Mogford, R. M. ; Rosiles, A. ; Wagner, D. ; Allendoerfer, K. R.

Federal Aviation Administration Technical Center, Atlantic City, NJ. 

Corp. Source Codes: 015213000; 411863 

Report No.: DOT/FAA/CT-TN97/2

Dec 97 29p 

Languages : English 

Journal Announcement: GRAI9814 

Product reproduced from digital image. Order this product from NTIS by: 
phone at 1-800-553-NTIS (U.S. customers); (703)605-6000 (other countries); 
fax at (703)321-8547; and email at orders@ntis.fedworld.gov. NTIS is
located at 5285 Port Royal Road, Springfield, VA, 22161, USA. 

NTIS Prices: PC A03/MF A01 

This document presents the findings of a voice technology study that
evaluated the potential of a speech to text and voice recognition
system to support an Airway Facilities maintenance task. Researchers
conducted the test at an Airport Surveillance Radar (ASR)-9 site at the
William J. Hughes Technical Center. Thirteen Airway Facilities specialists
completed the procedure twice, once with the voice technology system and
again with a...

...was no more time consuming or difficult to use than a traditional paper
manual. The voice recognition rate was 86.6%. Questionnaire responses
showed that users found the voice technology system understandable, easy
to control, and responsive to voice commands. When asked to compare voice
technology to the use of a...

Descriptors: Speech recognition; *Voice communications; *Air traffic
control terminal areas; Aircraft maintenance; Performance (Human); Human
factors engineering; Feasibility studies; Speech transmission; Man
computer interface; Workload; Air traffic controllers; Machine coding;
User friendly

Comp&distr 2000 NTIS, Intl Cpyrght All Right. All rts. reserv.

1272338 NTIS Accession Number: AD-A173 280/9 
ISI (Information Sciences Institute) Experimental Multimedia Mail System 

(Research rept) 

Postel, J. B. ; Finn, G. G. ; Katz, A. R. ; Reynolds, J. K. 

Information Sciences Inst., Marina Del Rey, CA. 

Corp. Source Codes: 083386000; 415543 

Report No.: ISI/RR-86-173 

Sep 86 31p 

Languages : English 

Journal Announcement: GRAI8703 

Order this product from NTIS by: phone at 1-800-553-NTIS (U.S. 
customers); (703)605-6000 (other countries); fax at (703)321-8547; and 
email at orders@ntis.fedworld.gov. NTIS is located at 5285 Port Royal Road, 
Springfield, VA, 22161, USA. 

NTIS Prices: PC A03/MF A01 

With multimedia computer mail, a user may create messages containing
text, image, and voice data and send such messages to other users
within a computer network. This paper describes the development,
implementation, and use of one such system. The following five sections
describe the overview of the system, the system model, the presentation
model, the multimedia mail program for the user's point of view, and plans
for future work. This mail system discusses a computer-based experimental
multimedia mail system that allows the user to read, create, edit, send,
and receive messages containing text, images, and voice.

Descriptors: Electronic mail; *Computer communications; Message
processing; User needs; Editing; Facsimile communications; Text
processing; Image processing; Voice communications



26/3, K/15 (Item 3 from file: 6) 

DIALOG (R) File 6: NTIS 
1228298 NTIS Accession Number: AD-A163 536/6

DARPA (Defense Advanced Research Projects Agency) Experimental Multimedia 
Mail System 

(Research rept) 

Reynolds, J. K. ; Postel, J. B. ; Katz, A. R. ; Finn, G. G. ; DeSchon, A. L.

University of Southern California, Marina del Rey. Information Sciences 
Inst. 

Corp. Source Codes: 045598002; 407952 
Report No.: ISI/RS-85-164 
Dec 85 12p 

Languages: English Document Type: Journal article 
Journal Announcement: GRAI8610

Pub. in IEEE Computer Magazine, p82-89 Oct 85. 

Order this product from NTIS by: phone at 1-800-553-NTIS (U.S. 
customers); (703)605-6000 (other countries); fax at (703)321-8547; and 
email at orders@ntis.fedworld.gov. NTIS is located at 5285 Port Royal Road, 
Springfield, VA, 22161, USA. 

NTIS Prices: PC A02/MF A01 

...describes the development, implementation, and use of an experimental
multimedia mail system. About 40 researchers in 10 organizations have
contributed to the experiment. With this system users can create
messages containing text, image, and voice data, and send such
messages to other users in the ARPA Internet. Keywords: ARPA Internet
Communication protocols; Computer mail; Electronic mail; and Multimedia.
(Reprints)
1124708 NTIS Accession Number: AD-A143 075/0
Experimental Internetwork Multimedia Mail System 

(Research rept) 
Katz, A. R. 

University of Southern California, Marina del Rey. Information Sciences 
Inst.

Corp. Source Codes: 045598002; 407952 

Report No.: ISI/RS-84-134 

Jun 84 14p 

Languages : English 

Journal Announcement: GRAI8421 

Order this product from NTIS by: phone at 1-800-553-NTIS (U.S. 
customers); (703)605-6000 (other countries); fax at (703)321-8547; and 
email at orders@ntis.fedworld.gov. NTIS is located at 5285 Port Royal Road, 
Springfield, VA, 22161, USA. 

NTIS Prices: PC A02/MF A01 

This paper describes the implementation and use of an experimental
multimedia mail system, in particular the user interface program called
MMM. Using MMM, it is possible for a user to create a multimedia message
which may contain various types of text, image, and voice data and to
then send the message to other hosts within the Department of Defense
(DoD) Internet Environment. MMM is written in Pascal and runs on a PERQ
personal computer equipped with a large bitmap display, a local hard disk,
and a...

...edited, or others created using a bitmap sketching program (which is
also a part of MMM). Section II of this paper briefly describes the DoD
internet and the family of protocols used in this environment. The
physical data connections between the PERQ running MMM and the various
networks used are also discussed. Section III describes the specific
protocol used. This protocol allows generated types of structured data to
be transferred within the internet. Section IV describes the subset of
this protocol implemented in MMM and gives a detailed account of how MMM
works and how one would use...

Descriptors: Message processing; *Data transmission systems; *Computer
communications; Communications networks; Installation; Media;
Minicomputers; User needs; Interfaces; Voice communications; Editing
Title: Telephony based speech technology - from laboratory visions to
customer applications 

Author : Johnston, Denis 

Corporate Source: BT Lab, Suffolk, UK 

Source: International Journal of Speech Technology v 2 n 2 Dec 1997. p 
89-99 

Publication Year: 1997 

CODEN: ISTEFM ISSN: 1381-2416 
Language: English

Title: Telephony based speech technology - from laboratory visions to 
customer applications 

Abstract: This paper describes how research into Automatic Speech
Recognition (ASR) and Text to Speech Synthesis (TTS) is being widely
applied within the UK telephone network. It compares and contrasts
telephony based speech technology with that used in non-telephony based
applications and describes some of the special problems associated with
integrating these into the existing telephone network. In particular, it
highlights the main issues concerned with providing flexible, yet robust,
multiple channel systems and shows how this has been achieved on a...

Descriptors: Automatic telephone systems; Speech recognition; Speech
synthesis; Telecommunication networks; Communication channels (information
theory); Speech transmission

(c) 2000 Engineering Info. Inc. All rts. reserv.

04785883 E.I. No: EIP97083777126 
Title: Experimental Japanese/English interpreting video phone system

Author: Karaorman, Murat; Applebaum, Ted H.; Itoh, Tatsuro; Endo, Mitsuru;
Ohno, Yoshio; Hoshimi, Masakatsu; Kamai, Takahiro; Matsui, Kenji; Hata,
Kazue; Pearson, Steve; Junqua, Jean-Claude

Corporate Source: Panasonic Technologies, Inc, Santa Barbara, CA, USA 

Conference Title: Proceedings of the 1996 International Conference on 
Spoken Language Processing, ICSLP. Part 3 (of 4) 

Conference Location: Philadelphia, PA, USA Conference Date: 
19961003-19961006 

E.I. Conference No.: 46796 

Source: International Conference on Spoken Language Processing, ICSLP, 
Proceedings v 3 1996. IEEE, Piscataway, NJ, USA, 96TH8206. p 1676-1679
Publication Year: 1996 
CODEN: 002642 
Language : English 

...Abstract: architectural design issues and experiences gained while
building and demonstrating an experimental interpreting video phone (IVP)
system. The IVP system has been demonstrated in an internet home shopping
simulation simultaneously before live audiences in Japan and the U.S. An
American shop assistant and a Japanese customer engaged in task-directed
dialogues, using their native languages. In addition to their direct
audio/visual contact by ISDN video phone, each participant heard a
translation of the remote speaker's utterances in a synthetic voice in
real-time. Each site used a medium-size vocabulary, a continuous speech
recognition system and a text-to-speech synthesis (TTS) system
for the local language. Recognition results were transmitted over the
internet to the remote site, where the corresponding translated sentence
was spoken by TTS in the listener's native language. All of the speech
and language processing software components of the system were
independently developed proprietary technologies of the...

Descriptors: Speech transmission; Video telephone equipment; Speech
synthesis; Linguistics; Wide area networks; Speech recognition

Identifiers: Interpreting video phone (IVP) system; Internet

04484828 E.I. No: EIP96083298886
Title: Development of the stand-alone audiotex system 

Author: Jeong, Youhyeon; Yi, Sionghun 

Corporate Source: Electronics and Telecommunications Research Inst
(ETRI), Taejon, S Korea

Conference Title: Proceedings of the 1996 International Conference on 
Communication Technology Proceedings, ICCT'96. Part 1 (of 2) 

Conference Location: Beijing, China Conference Date: 19960505-19960507 

E.I. Conference No.: 45212 

Source: International Conference on Communication Technology Proceedings, 
ICCT v 1 1996. IEEE, Piscataway, NJ, USA. p 441-444 
Publication Year: 1996 
CODEN: 002424 
Language : English 

Abstract: Audiotex is a general system that combines computers and
telephones to deliver audio information by adopting text-to-speech
(TTS) technology. TTS is a technology that converts text messages into
synthetic speech based on both linguistic analysis of the text and the
acoustic knowledge of the production...

...this system, we adopt the pitch synchronous overlap and add (PSOLA)
algorithm as the synthesis method. The system is composed of a public
switched telephone network interface unit, a main control unit, a data
interface unit, and TTS synthesis unit. It can be applied to a variety of
reading services when connected to a host computer and telephone network.
(Author abstract) 7 Refs.

Descriptors: Telephone systems; Speech synthesis; Information retrieval
systems; Information technology; Audio acoustics; Sound reproduction;
Computers; Algorithms; Telephone switching equipment; User interfaces

Identifiers: Audiotex system; Audio information; Text to speech
technology; Speech sounds; Pitch synchronous overlap and add algorithm;
Public switched telephone network; Data interface unit

Title: Tecnologie vocali interattive sul campo: l'esperienza CSELT

Title: Interactive voice technology at work: the CSELT experience 
Author: Billi, R. ; Canavesio, F. ; Ciaramella, A.; Nebbia, L. 
Corporate Source: CSELT

Source: CSELT Technical Reports (Centro Studi e Laboratori 
Telecomunicazioni) v 23 n 1 Feb 1995. p 75-89 
Publication Year: 1995 
CODEN: CTRPEJ ISSN: 0393-2648 
Language : Italian 

...Abstract: paper is a survey of the speech technologies and
applications developed at CSELT, some of which are employed in real
services on the Italian telephone network. With the rise of significant
speech recognition and text-to-speech applications, the activity of our
lab encompasses now a broader set of activities, which range from defining
and experimenting new algorithmic approaches to speech product...

...technology research and describes two operative applications, a voice
dialing service for large name directories, which is installed in the CSELT
PABX, and an automated network service for directory assistance, which is
now accessible to all the Italian telephone customers. (Author abstract)
13 Refs.

Descriptors: Speech transmission; Voice/data communication systems;
Speech recognition; Algorithms; Telephone systems; Telecommunication
networks; Automation; Private telephone exchanges

Identifiers: Speech technology; Speech product engineering; Interactive 
voice technology; Automated network service; Voice dialing service; 
Directory assistance 
Title: Multicast support for group communications.

Author: Ngoh, L. H. 

Corporate Source: Univ of Manchester, Manchester, Engl 

Source: Computer Networks and ISDN Systems v 22 n 3 Oct 7 1991 p 165-178 

Publication Year: 1991 

CODEN: CNISE9 ISSN: 0169-7552 

Language : English 

...Abstract: into existing unicast communication systems to provide
better support for group communications. Multicast services are becoming
more important, as more and more of today's network workstation
environments are used to provide group communications for the exchange of
multimedia information [29]. These environments
allow users to exchange information in the form of 'documents' containing
text, graphics and voice; some systems support both store-and-forward
(e.g., mail) and real-time (e.g., conferencing) material. In this paper,
various multicast design issues are addressed and...

...Descriptors: Voice/Data Integrated Services; DATA TRANSMISSION



26/3, K/22 (Item 1 from file: 99) 

DIALOG (R) File 99: Wilson Appl. Sci & Tech Abs
(c) 2000 The HW Wilson Co. All rts. reserv. 

2104866 H.W. WILSON RECORD NUMBER: BAST00023532 
New talk 

Bainbridge, Heather; 

Wireless Review v. 17 no6 (Mar. 15 2000) p. 18-22 
DOCUMENT TYPE: Feature Article ISSN: 1099-9248 

ABSTRACT: The marriage of wireless and Internet is fueling the
development of voice access to data sources. Voice-recognition and
text-to-speech services that allow users to search a web site or check
their e-mail from a wireless phone are being implemented. Some wireless
carriers already offer a service whereby customers can dial phone numbers
or navigate their voice mail using voice commands. Internetspeech.com is
beta testing a system that allows users to access e-mail and web sites
via any telephone. Voice-recognition systems also offer a hands-free
safety factor.

DESCRIPTORS: Integrated voice data transmission;...
...Internet telephony;

05916066 Genuine Article#: XG279 No. References: 36

Title: Audio/video and synthetic graphics/audio for mixed media

Author(s): Doenges PK (REPRINT); Capin TK; Lavagetto F; Ostermann J;
Pandzic IS; Petajan ED

Corporate Source: EVANS & SUTHERLAND COMP CORP, 600 KOMAS DR, POB 58700/SALT
LAKE CITY//UT/84158 (REPRINT); ECOLE POLYTECH FED LAUSANNE, LIG, COMP
GRAPH LAB/CH-1015 LAUSANNE//SWITZERLAND/; UNIV GENOA, DIST, DEPT
TELECOMMUN COMP & SYST SCI/I-16145 GENOA//ITALY/; AT&T BELL LABS, RES
LABS/HOLMDEL//NJ/07733; UNIV GENEVA, CUI, MIRALAB/CH-1211 GENEVA
4//SWITZERLAND/; AT&T BELL LABS, LUCENT TECHNOL/MURRAY HILL//NJ/07974

Journal: SIGNAL PROCESSING-IMAGE COMMUNICATION, 1997, V9, N4 (MAY), P 
433-463 

ISSN: 0923-5965 Publication date: 19970500 

Publisher: ELSEVIER SCIENCE BV, PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS 
Language: English Document Type: ARTICLE (ABSTRACT AVAILABLE) 

...Abstract: synthetic, aural and visual (A/V) information. The objective
of this synthetic/natural hybrid coding (SNHC) is to facilitate
content-based manipulation, interoperability, and wider user access
in the delivery of animated mixed media. SNHC will support
non-real-time and passive media delivery, as well as more interactive,
real-time...

...streamed A/V objects, and spatial-temporal integration of mixed media
types. Composition, interactivity, and scripting of A/V objects can
thus be supported in client terminals, as well as in content
production for servers, also more effectively enabling terminals as
servers. Such A/V objects can exhibit high efficiency in transmission
and storage, plus content-based interactivity, spatial-temporal
scalability, and combinations of transient dynamic data and...

...that exploit spatial and temporal coherence over buses and networks.
MPEG-4 responds to trends at home and work to move beyond the paradigm
of audio/video as a passive experience to more flexible A/V objects
which combine audio/video with synthetic 2D/3D graphics and audio. (C)
1997 Published by Elsevier Science B...

(c) 2000 Derwent Info Ltd
File 347:JAPIO Oct 1976-2000/May(UPDATED 000915) 

(c) 2000 JPO & JAPIO 
File 344:Chinese Patents ABS Apr 1985-2000/Aug 

(c) 2000 European Patent Office 

Set       Items   Description
S1          213   ((TEXT? ?(2W)(SPEECH OR VOICE)))(5N)SYSTEM? ? OR TTS
S2          904   TEXT? ?(2N)(TRANSFORM? OR CONVERT? OR CONVERSION? OR
                  SYNTHES? OR (CHANGE? OR TURN?)(2N)INTO)(5N)(SOUND OR
                  AUDIO? OR VOICE? OR SPEECH)
S3         1040   S1 OR S2
S4            3   S3(10N)((WEB OR NETWORK OR W3 OR INTERNET OR
                  INTRANET)(5N)(SERVER? OR SITE?) OR WEB()PAGE?)
S5          220   AUDIO(2W)(WAVEFORM? OR WAVE()FORM?)
S6         1351   (PROSOD? OR ACCENTUAT? OR INTONATION?)
S7        35992   (SPEECH OR VOICE)(2N)(SYNTHES? OR GENERAT?)
S8         2180   CONCATENAT?
S9         1299   (SPEECH? OR SOUND? OR VOICE)(2N)(FRAGMENT? OR SAMPL?)
S11        1263   SYLLABLE?
S13      181328   (PITCH? OR DURATION OR APTITUDE OR (ATTACK OR
                  DECAY)(2N)ENVELOP? OR (SYNTHES?()INSTRUCT?))
S14        1945   ((NATURAL OR HIGH()QUALITY)(3N)(SOUND? OR SPEECH?))
S15         659   TEXT? ?(2W)(SPEECH OR VOICE OR SOUND)
S16      367221   (WEB OR NETWORK OR W3 OR INTERNET OR INTRANET OR
                  SERVER? OR SITE? OR WEB()PAGE?)
S17         129   S15 AND S16
S18       62904   SYNTHESIZ?
S19          11   S17 AND S18
S20           7   S17 AND (S5:S6 OR S8:S10 OR S13:S14)
S21      307568   USER? OR CUSTOMER? OR CLIENT? OR SUBSCRIB?
S22          66   S17 AND S21
S23          63   S22 AND (SERVER? OR NETWORK)
S24       37182   (ROUT? OR DELIVER? OR SEND OR SENT OR TRANSMIT? OR
                  TRANSMIS? OR PASS? OR REMIT?)(5N)(AUDIO OR SPEECH OR
                  VOICE OR SOUND)
S25          18   (DOWNLOAD)(5N)(AUDIO OR SPEECH OR VOICE OR SOUND)
S26          22   S22 AND (S24 OR S18)
S27          15   S26 NOT S19 NOT S20
S28         310   S15 AND (S18 OR GENERAT?)
S29          85   S28 AND S21
S30          11   S29 AND (S5:S6 OR S8:S10 OR S13:S14)
S31          10   S30 NOT (S19 OR S20)
S32       83469   WAVEFORM? OR WAVE()FORM? ?
S33       36483   S32 AND (S18 OR GENERAT?)
S34          34   S33 AND S15
S35          18   S34 AND (S6 OR S8:S10 OR S13:S14)
S36          17   S35 NOT (S19 OR S20 OR S31)
? 
4/3,IC,K/1 (Item 1 from file: 350)

DIALOG (R) File 350: Derwent WPIX

(c) 2000 Derwent Info Ltd. All rts. reserv.

013102007 

WPI Acc No: 2000-273878/200024 

XRPX Acc No: N00-205313 
Communication system between email server and PSTN, to allow subscriber 
to send and receive messages, using dedicated internet server with 
text-to-speech conversion

Patent Assignee: KORTEX INT SA (KORT-N) 

Inventor: AJJAN S; ZANZOURI F 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No     Kind  Date      Applicat No   Kind  Date      Week
FR 2783993    A1    20000331  FR 9811968    A     19980924  200024 B

Priority Applications (No Type Date): FR 9811968 A 19980924
Patent Details:

Patent No Kind Lan Pg Main IPC Filing Notes
FR 2783993 A1 33 H04M-011/00
International Patent Class (Main) : H04M-011/00 

Communication system between email server and PSTN, to allow subscriber 
to send and receive messages, using dedicated internet server with 
text-to-speech conversion

Abstract (Basic):

... and local equipment (6) connected to the PSTN which can be
interrogated by a voice telephone (5). The local equipment can interact
with an email server (2) via the telephone network and store voice
messages, after conversion from text format, for subsequent
transmission to the telephone subscriber.

Connection to an email message server over the PSTN and
internet with conversion of text to voice.



4/3,IC,K/2 (Item 2 from file: 350) 

DIALOG (R) File 350: Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv. 

011471102 

WPI Acc No: 1997-449009/199741 

XRPX Acc No: N97-374168 
Accessing and retrieving information from interconnected networks e.g. 
internet - converting information content of web page from text 
to speech, signals hyperlink selections of web page into audio 
manner and allows selection of hyperlinks through use of DTMF signals 
generated from telephone 

Patent Assignee: NETPHONIC COMMUNICATIONS INC (NETP-N) 

Inventor: HAHN J S; KWAN R J; OLSEN L E; RHIE K H 

Number of Countries: 022 Number of Patents: 003 

Patent Family: 

Patent No     Kind  Date      Applicat No   Kind  Date      Week
WO 9732427    A1    19970904  WO 97US3329   A     19970228  199741 B
AU 9719851    A     19970916  AU 9719851    A     19970228  199803
US 5953392    A     19990914  US 96609699   A     19960301  199944

Priority Applications (No Type Date): US 96609699 A 19960301
Patent Details:

Patent No Kind Lan Pg Main IPC Filing Notes
WO 9732427 A1 E 57 H04M-002/00

Designated States (National) : AU CA JP KR 

Designated States (Regional) : AT BE CH DE DK ES FI FR GB GR IE IT LU MC 
NL PT SE 

AU 9719851 A H04M-001/00 Based on patent WO 9732427 

US 5953392 A H04M-001/64 

International Patent Class (Main) : H04M-001/00; H04M-001/64; H04M-002/00 

...converting information content of web page from text to
speech, signals hyperlink selections of web page into audio manner
and allows selection of hyperlinks through use of DTMF signals generated
from telephone

4/3,IC,K/3 (Item 3 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv.

011373427

WPI Acc No: 1997-3[illegible]1334/199732

XRPX Acc No: N97-291138
Audio access system for resources in wide area network, e.g. Internet - 
uses audio enabled pages created to link particular text data which can 
be from WWW and can be retrieved by audio web server for interpreting 
pages into audio which is displayed at audio interface 

Patent Assignee: UNIV RUTGERS STATE NEW JERSEY (RUTF ) 

Inventor: IMIELINSKI T; VIRMANI A 

Number of Countries: 071 Number of Patents: 002 

Patent Family: 

Patent No     Kind  Date      Applicat No   Kind  Date      Week
WO 9723973    A1    19970703  WO 96US20409  A     19961220  199732 B
AU 9715664    A     19970717  AU 9715664    A     19961220  199745

Priority Applications (No Type Date): US 959153 A 19951222
Patent Details:
Patent No Kind Lan Pg Main IPC Filing Notes
WO 9723973 A1 E 33 H04L-012/16
AU 9715664 A H04L-012/16 Based on patent WO 9723973

International Patent Class (Main) : H04L-012/16 
International Patent Class (Additional) : H04M-001/64 

...Abstract (Basic): system for providing audio access to resources in a
wide area network generates an audio enabled page by selectively
choosing data from the resources. An audio web server provides
text to audio conversion of the audio enabled page. A connection
is established to the audio web server from an audio interface.
Information is selected and retrieved from the audio enabled page in
response to input entered over the connection. The retrieved
information...

19/3,IC,K/2 (Item 2 from file: 350)

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv. 



013247043 

WPI Acc No: 2000-418925/200036 

XRPX Acc No: N00-313530 
Edit system for telephone message, enables user to correct speech 
obtained from speech synthesizer such that corrected speech is provided 
as text for transmission over communication system 

Patent Assignee: INT BUSINESS MACHINES CORP (IBMC ); IBM CORP (IBMC ) 

Number of Countries: 002 Number of Patents: 002 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

JP 2000148182 A 20000526 JP 99187372 A 19990701 200036 B 

CN 1255011 A 20000531 CN 99110989 A 19990702 200045 

Priority Applications (No Type Date) : US 98185332 A 19981103 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
JP 2000148182 A 32 G10L-015/22 

CN 1255011 A H04M-011/00 

International Patent Class (Main) : G10L-015/22; H04M-011/00 
International Patent Class (Additional): G06F-017/28; G10L-013/00; 
G10L-015/00; H04M-003/42 
Edit system for telephone message, enables user to correct speech 
obtained from speech synthesizer such that corrected speech is provided 
as text for transmission over communication system 

Abstract (Basic) : 

... A server receives voice input from user through telephone. A
speech-recognition system converts the received voice to a text. A
speech-synthesizer converts the text to a synthesized speech to
enable correction by user. The corrected voice is transmitted as text
through a communication system.

Since corrected speech can be transmitted as text, speech-
synthesizer is not needed at receiver side to read the corrected
message...



19/3,IC,K/3 (Item 3 from file: 350) 

DIALOG (R) File 350: Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv. 

013071032 

WPI Acc No: 2000-242904/200021 

XRPX Acc No: N00-183011 
Information processor for e-mail received from portable telephone, has 
judging unit to determine skip condition based on output from skip 
condition retainer so that mail adapted to skip condition is not read 

Patent Assignee: CANON KK (CANO ) 

Number of Countries: 001 Number of Patents: 001 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

JP 2000059511 A 20000225 JP 98220808 A 19980804 200021 B

Priority Applications (No Type Date) : JP 98220808 A 19980804 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
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JP 2000059511 A 8 H04M-003/42 

International Patent Class (Main) : H04M-003/42 

International Patent Class (Additional): G06F-003/16; G06F-013/00; 
H04M-011/00 



Abstract (Basic): NOVELTY - The information processor (100) has a mail 
server (104) to manage mail, a mail retainer (102) to hold the 

currently processing mail and a speech synthesizer (103) to convert 
text to speech . A skip condition retainer (107) holds skip 

conditions about the mail as registered by the user. A skip condition 

judging unit (106) judges the condition... 

skip conditions is not read. DESCRIPTION OF DRAWING ( S) - The figure 
shows block diagram of information processor. (100) Information 
processor; (102) Mail retainer; (103) Speech synthesizer ; (106) 
Judging unit; (107) Skip condition retainer. . . 
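
The skip-condition check described in this abstract can be sketched as
follows. This is a minimal illustration only: the Mail fields, the
SkipCondition type and the function mails_to_read are hypothetical names
chosen for the example and do not come from the patent record.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Mail:
    # Hypothetical mail fields, chosen only for this illustration.
    sender: str
    subject: str
    body: str


# A skip condition is a user-registered predicate: True means "do not read".
SkipCondition = Callable[[Mail], bool]


def mails_to_read(mails: Iterable[Mail],
                  skip_conditions: List[SkipCondition]) -> List[Mail]:
    """Return the mails that match no registered skip condition.

    Only these mails would be handed on to the speech synthesizer.
    """
    return [m for m in mails
            if not any(cond(m) for cond in skip_conditions)]
```

With a condition such as lambda m: m.sender.endswith("@ads.example"),
mail from that domain would be dropped before text-to-speech conversion,
while all other mail is read out.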



19/3,IC,K/4 (Item 4 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts . reserv. 



012889315 

WPI Acc No: 2000-061149/200005 
XRPX Acc No: N00-047869 
Error compensating device for speech data encoding system 

Patent Assignee: INT BUSINESS MACHINES CORP (IBMC ) 
Inventor: BANTZ D F; ZAVREL R J 

Number of Countries : 001 Number of Patents : 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

US 5987405 A 19991116 US 97881435 A 19970624 200005 B 

Priority Applications (No Type Date) : US 97881435 A 19970624 
Patent Details : 

Patent No Kind Lan Pg Main IPC Filing Notes 
US 5987405 A 13 G10L-005/00 

International Patent Class (Main) : G10L-005/00 
International Patent Class (Additional) : H04B-001/66 

Abstract (Basic) : 

signals are converted into digital signals, by A/D converter 
(10) . The digital signals are then converted into text representation 
by the recognizer (11) . The synthesizer (14) converts the text into 
original speech signal. A compensator (17) synchronizes the original 
speech signal and facsimile signal by correlation so that the minimum 
error component is compressed and effective bandwidth... 

For speech data encoding system used in deep space and submarine 
voice communication, battlefields and in internet . 



. . .Synthesizer (14 
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WPI Acc No: 1999-331711/199928 
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Call answering method for portable telephone, stationary telephone - 
involves passing audio to companion based on synthesized modification 
data 

Patent Assignee: HITACHI LTD (HITA ) 

Number of Countries: 001 Number of Patents: 001 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

JP 11119794 A 19990430 JP 97280054 A 19971014 199928 B 

Priority Applications (No Type Date) : JP 97280054 A 19971014 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
JP 11119794 A 36 G10L-005/02 

International Patent Class (Main) : G10L-005/02 

International Patent Class (Additional) : G10L-003/00; G10L-003/02; 
H04M-001/64 

. . . involves passing audio to companion based on synthesized 
modification data 

. . .Abstract (Basic) : NOVELTY - The modification data like sound source set, 
sound volume parameter, message text , speech rate are read from the 
memory based on identified companion. The modification data are 
synthesized and corresponding audio is output to the companion. 
DETAILED DESCRIPTION - The key information like name, background sound 
and telephone number of calling party is extracted. . . 

. . .USE - For portable telephone, stationary telephone connected to 
internet . 
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(c) 2000 Derwent Info Ltd. All rts . reserv. 

012371791 

WPI Acc No: 1999-177898/199915 

XRPX Acc No: N99-131412 
Speech synthesis terminal equipment for electronic meeting system - has 
speech synthesizing unit that converts text information into speech 
synthesis signal when transmission destination identification information 
corresponds to identification information 

Patent Assignee: SANYO ELECTRIC CO LTD (SAOL ) 

Number of Countries: 001 Number of Patents: 001 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

JP 11032123 A 19990202 JP 97183372 A 19970709 199915 B 

Priority Applications (No Type Date) : JP 97183372 A 19970709 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
JP 11032123 A 6 H04M-003/56 

International Patent Class (Main) : H04M-003/56 
International Patent Class (Additional) : G10L-003/00 

has speech synthesizing unit that converts text information into 
speech synthesis signal when transmission destination identification 
information corresponds to identification information 

...Abstract (Basic): NOVELTY - A speech synthesizing unit (36) converts a 
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text information into a speech synthesis signal when a transmission 
destination identification information corresponds to the 
identification information of a receiving. . . 

equipment. A transmitting data forming unit (33) creates the 
transmitting data containing the transmission destination 
identification information. The transmission destination identification 
information is formed by synthesizing the text information and the 
identification information. A network communication unit (34) is used 
to transmit the created transmitting data to other speech synthesis 
terminal equipment. . . 

ADVANTAGE - Enables synthesizing the speech of the text information 
that is sent from the speech synthesis terminal equipment of a 
transmitting agency. DESCRIPTION OF DRAWING (S) - The figure shows the 
block diagram of the speech synthesis terminal equipment. (11-14) 
Speech synthesis terminal equipment; (33) Transmitting data forming 
unit; (34) Network communication unit; (35) Identification 
information judging unit; (36) Speech synthesizing unit... 



19/3,IC,K/7 (Item 7 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts . reserv. 

011916679 

WPI Acc No: 1998-333589/199829 

XRPX Acc No: N98-260347 
Generation method for parametric representation of speech - generating 
principal set and supplementary set of speech parameters and providing 
feedback using supplementary set of parameters to modify principal set of 
parameters 

Patent Assignee: MOTOROLA INC (MOTI ) 

Inventor: CORRIGAN G; KARAALI O; MASSEY N 

Number of Countries: 017 Number of Patents: 002 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

WO 9825260 A2 19980611 WO 97US18815 A 19971015 199829 B 

EP 932896 A2 19990804 EP 97946261 A 19971015 199935 

WO 97US18815 A 19971015 



Priority Applications (No Type Date) : US 96761627 A 19961205 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
WO 9825260 A2 E 28 G10L-000/00 

Designated States (Regional) : AT BE CH DE DK ES FI FR GB GR IE IT LU MC 

NL PT SE 

EP 932896 A2 E G10L-005/04 Based on patent WO 9825260 

Designated States (Regional) : BE DE FR GB 
International Patent Class (Main) : G10L-000/00; G10L-005/04 

...Abstract (Basic): Pref. the modified principal set of speech parameters 
is output to a waveform synthesizer to synthesize speech. The coder 
parameter generating system can be divided into a principal system and
a subsystem. The supplementary set of speech parameters consists of 
energies in each of a predetermined set of frequency bands for speech 
in a selected time period. The coder parameter generating system can be 
a neural network or a decision tree unit, or alternatively it can use 
a genetic algorithm. . . 

...USE - For speech synthesis system, e.g. converting text to speech . 
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. . . ADVANTAGE - Improves performance of text -to-speech system without 
increasing size of database used to create system 
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06562449 

EDITING SYSTEM AND METHOD USED FOR TRANSCRIPTION OF TELEPHONE MESSAGE 



PUB. NO.: 20-00148182 [JP 2000148182 A]
PUBLISHED: May 26, 2000 (20000526)
INVENTOR(s): MUKUNDO PADOMANABUHAN
MICHAEL PICHENY
DAVID NAHAMUU
SALIM ROOKOSU

APPLICANT(s): INTERNATL BUSINESS MACH CORP <IBM>
APPL. NO.: 11-187372 [JP 99187372]
FILED: July 01, 1999 (19990701)
PRIORITY: 185332 [US 185332], US (United States of America), November
03, 1998 (19981103)

INTL CLASS: G10L-015/22; G06F-017/28; G10L-013/00; G10L-015/00;
H04M-003/42



ABSTRACT 

PROBLEM TO BE SOLVED: To correct a transcribed text with a voice by
regenerating a synthesized speech, making a user correct the synthesized
voice, and transmitting the corrected voice as a text through a
communication system.

SOLUTION: A telephone server 26 transfers a text and a diagnosis to a
speech synthesizing server 34. The speech synthesizing server 34
creates a synthesized speech and returns this synthesized speech to the
telephone server 26. The telephone server 26 regenerates the
synthesized speech to a user through telephone lines. One purpose of
regenerating the synthesized speech to the user is to allow the user to
correct an unacceptable or inaccurate region. The telephone server 26
provides the user with an option of correcting a message. The regeneration
of a voice related to a correcting mechanism 36 is achieved in many
methods. When the user is satisfied with the transcription, the telephone
server 26 transmits the text together with a recorded voice to a message
server 12.
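
The playback-and-correct cycle described in this abstract can be sketched
as a simple loop. This is a hedged illustration, not the patented
implementation: Message, review_and_send and the four callables
(synthesize, play_to_user, get_correction, send) are hypothetical
stand-ins for the speech synthesizing server 34, the telephone server 26,
the correcting mechanism 36 and the message server 12.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Message:
    text: str     # transcription produced by speech recognition
    audio: bytes  # recorded voice of the caller


def review_and_send(message: Message,
                    synthesize: Callable[[str], bytes],
                    play_to_user: Callable[[bytes], None],
                    get_correction: Callable[[], Optional[str]],
                    send: Callable[[str, bytes], None]) -> Message:
    """Play the transcription back until the user accepts it, then send it.

    get_correction() returns a corrected text, or None when the user is
    satisfied (an assumption made for this sketch).
    """
    while True:
        play_to_user(synthesize(message.text))
        correction = get_correction()
        if correction is None:
            break
        message.text = correction
    # The accepted text is transmitted together with the recorded voice.
    send(message.text, message.audio)
    return message
```

The loop repeats playback after every correction, so the user always
hears the text that will actually be transmitted.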



COPYRIGHT: (C) 2000, JPO 
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06323595 

INFORMATION DISTRIBUTION SYSTEM, INFORMATION TRANSMITTER, INFORMATION 
RECEIVER AND INFORMATION DISTRIBUTING METHOD 



PUB. NO.: 11-265195 [JP 11265195 A]
PUBLISHED: September 28, 1999 (19990928)
INVENTOR(s): NAKATSUYAMA TAKASHI
IMAI TSUTOMU
APPLICANT(s): SONY CORP
APPL. NO.: 10-072811 [JP 9872811]
FILED: March 20, 1998 (19980320)
PRIORITY: 5538 [JP 985538], JP (Japan), January 14, 1998 (19980114)
INTL CLASS: G10L-003/00; G06F-003/16; G06F-003/16; G06F-013/00;
G06F-017/28; G10L-005/02

ABSTRACT 

. . . SD) . On the side of information receivers 6 and 7, the text information 
is separated from the intermediate language information and displayed out, 
voices are synthesized while using the intermediate language information, 
and that synthetic voice information is outputted. Namely, as the 
intermediate language information, text data for voice synthesization 

in voice synthesizing processing are analyzed and information made into 
prescribed data format is transmitted from the server side (information
transmitters) to the terminal equipment side (information receivers) . 

COPYRIGHT: (C) 1999, JPO 
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VOICE BROWSER SYSTEM 



PUB. NO.: 11-249867 [JP 11249867 A]
PUBLISHED: September 17, 1999 (19990917)
INVENTOR(s): NAMIKI IKUO
HAYASHI HIROMICHI
KANAMARU TETSUYA
KIMEDA TSUNEJI
UJIIE MASAMI
APPLICANT(s): NIPPON TELEGR & TELEPH CORP <NTT>
NTT ELECTRONICS CORP
APPL. NO.: 10-048180 [JP 9848180]
FILED: February 27, 1998 (19980227)
INTL CLASS: G06F-003/16; G06F-013/00; G06F-013/00



ABSTRACT 

... BE SOLVED: To provide a voice browser system which enables even a
visually handicapped person to acquire the WWW information.

SOLUTION: This system includes a server 100 that has a voice request 
acquisition means 101 which acquires a request from a client 200 via the 
input of voices, a voice recognition. . . 

. . . which transmits a request to the URL that is designated by the client 
200 based on the recognition result of the means 102 to an internet 70, a 
voice data generation means 104 which extracts a read-aloud text from the
answer given from the internet 70 and converts the text into the voice
data to synthesize the voices and a voice data transmission means 105
which transmits the voice data generated by the means 104 to the client 
200. The system. . . 

. . . which inputs the requests given from the users in voices, a request 
issue means 202 which extracts the URL from the result acquired from the 

server 100 and gives a request of an HTML file to the server 100 based 
on the extracted URL and a voice output means 203 which outputs the voice 
data received from the server 100. 
PUB. NO.: 04-310049 [JP 4310049 A]
PUBLISHED: November 02, 1992 (19921102)
INVENTOR(s): NISHIMORI HISAKIMI
APPLICANT(s): FUJI XEROX CO LTD [359761] (A Japanese Company or
Corporation), JP (Japan)
APPL. NO.: 03-101845 [JP 91101845]
FILED: April 08, 1991 (19910408)
INTL CLASS: [5] H04M-003/42; H04M-003/50; H04Q-003/58
JOURNAL: Section: E, Section No. 1336, Vol. 17, No. 140, Pg. 145,
March 22, 1993 (19930322)



ABSTRACT 

... system is provided with work stations 6-1, 6-2, a protocol converter 
interface processor 2 suited to a communication protocol of a local area 

network 3, a database equipment server 4 storing a number of telephone 
sets 7-1, 7-2 corresponding to the work station connecting to a PBX and an 
address of the work station with cross reference and a voice synthesizer 

text voice conversion section 5 converting text information into a 

voice signal. 
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Remote monitoring method of interaction between call center attendant and 

caller in telecommunication system 
Patent Assignee: METRO ONE TELECOM INC (METR-N) 

Inventor: COX P M; GIRSCH J E; HUEY C A; KEPLER M A; LEE A S; POWELL A P 
Number of Countries: 085 Number of Patents: 002 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

WO 9959316 A1 19991118 WO 99US10268 A 19990511 200007 B

AU 9939803 A 19991129 AU 9939803 A 19990511 200018 

Priority Applications (No Type Date) : US 9875780 A 19980511 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 

WO 9959316 A1 E 48 H04M-003/00

Designated States (National) : AE AL AM AT AU AZ BA BB BG BR BY 
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Designated States (Regional) : AT BE CH CY DE DK EA ES FI FR GB 

IE IT KE LS LU MC MW NL OA PT SD SE SL SZ UG ZW 
AU 9939803 A H04M-003/00 Based on patent WO 9959316 

International Patent Class (Main) : H04M-003/00 
International Patent Class (Additional) : H04L-012/66 

Abstract (Basic) : 

identification, destination party identification, geographical 
origination and destination of the call, date and time of the call, 
service provider, call center, call center attendant and duration of 
the call. An INDEPENDENT CLAIM is also included for remote monitoring 
apparatus between call center attendant and caller in telecommunication 
system. . . 

...in the call monitor. The interface with which the reviewer is connected, 
allows reviewer to access call recordings stored in the call monitor 
via a web browser or other interfaces, to enable speech recognition, 
speech-to-text conversion, text -to-speech conversion and to obtain 
information displayed on the call center attendant's terminal during 
the call etc . . . 
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WPI Acc No: 1999-559781/199947 
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Speech signal distribution system for computer network 
Patent Assignee: LERNOUT & HAUSPIE SPEECHPRODUCTS (LERN-N) 
Inventor: TEL M P 

Number of Countries: 001 Number of Patents: 001 
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Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

US 5943648 A 19990824 US 96638061 A 19960425 199947 B 

Priority Applications (No Type Date) : US 96638061 A 19960425 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
US 5943648 A 11 G10L-005/02 

International Patent Class (Main) : G10L-005/02 

Speech signal distribution system for computer network 

Abstract (Basic) : 

Text -speech parameter converter converts text containing 
sentences into a data stream with speech signal parameters representing 
spoken text and lacking phrase sentence level prosodic content. A 
supplemental parameter generator (128) inserts additional data 
representing linguistic boundaries which represent parameters 
associated with predefined boundaries into the data stream. 

For computer network including Internet for transmitting 
voice messages in encoded form and for generating animated pictures of 
a person speaking simultaneously with corresponding audio signal... 
...Title Terms: NETWORK 
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WPI Acc No: 1999-287158/199924 
XRPX Acc No: N99-214450 

Speaker access control method using text independent speech 

recognition e.g. for banking services 

Patent Assignee: INT BUSINESS MACHINES CORP (IBMC )
Inventor: KANEVSKY D; MAES S H 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

US 5897616 A 19990427 US 97871784 A 19970611 199924 B 

Priority Applications (No Type Date): US 97871784 A 19970611 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
US 5897616 A 15 G10L-009/08 

International Patent Class (Main) : G10L-009/08 

Speaker access control method using text independent speech 
recognition e.g. for banking services 

Abstract (Basic) : 

... A voice sample is taken from the utterances and processed 

against an acoustic model. A score corresponding to accuracy of decoded 
answer and closeness of match between voice sample and acoustic
model. The score is compared to predefined threshold value and when 
above it, speaker access to the server is permitted. An INDEPENDENT 
CLAIM is also included for speaker access control apparatus... 
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WPI Acc No: 1999-103329/199909 

Intonation generation method of a text-to- speech conversion system 
using intonation pattern normalization and neural network learning - 
NoAbstract

Patent Assignee: KOREA ELECTRONICS & TELECOM RES (KOEL-N) 
Inventor: HAN M S; KIM S H; LEE J C; LEE Y J 
Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

KR 97050108 A 19970729 KR 9555841 A 19951223 199909 B 

Priority Applications (No Type Date) : KR 9555841 A 19951223 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
KR 97050108 A G10L-005/00 
International Patent Class (Main) : G10L-005/00 

Intonation generation method of a text-to- speech conversion system 
using intonation pattern normalization and neural network learning. . . 
Title Terms: INTONATION ; 
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WPI Acc No: 1999-079765/199907 
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Text to speech profile interchange for text message chatting - uses 

the interchanging of the text to speech profile with inclusion of 

control code in the text message 
Patent Assignee: INT BUSINESS MACHINES CORP (IBMC ) 
Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

RD 416110 A 19981210 RD 98416110 A 19981120 199907 B 

Priority Applications (No Type Date) : RD 98416110 A 19981120 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
RD 416110 A 1 G06F-000/00 

International Patent Class (Main) : G06F-000/00 

Text to speech profile interchange for text message chatting. . . 

. . .uses the interchanging of the text to speech profile with inclusion 
of control code in the text message 

. . .Abstract (Basic) : Operation of the system commences once a network 

conversation connection between another person is commenced. The system
swaps the Text to speech (TTS) profile to represent the character 
of the person who is speaking on the opposite side. Characteristics 
available include male or female tone, frequency and pitch of 
speaker, and volume of the intonation . 



. . .ADVANTAGE - Reduces the system and network traffic and offers a text 
orientated human readable file 
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XRPX Acc No: N98-213896 
Generation of segment durations in text-to- speech system - mapping 
sequence of phones to sequence of articulatory features, using prominence 
and boundary information as well as predetermined set of rules for type, 
phonetic context and syntactic and prosodic context 

Patent Assignee: MOTOROLA INC (MOTI ) 

Inventor: CORRIGAN G; KARAALI O; MASSEY N 

Number of Countries: 018 Number of Patents: 003 

Patent Family: 



Patent No   Kind Date     Applicat No   Kind Date     Week
WO 9819297  A1   19980507 WO 97US18761  A    19971015 199824
EP 876660   A1   19981111 EP 97946842   A    19971015 199849
                          WO 97US18761  A    19971015
US 5950162  A    19990907 US 96739975   A    19961030 199943



Priority Applications (No Type Date) : US 96739975 A 19961030 
Patent Details : 

Patent No Kind Lan Pg Main IPC Filing Notes 
WO 9819297 A1 E 24 G10L-003/02

Designated States (Regional): AT BE CH DE DK ES FI FR GB GR IE IT LU MC 

NL PT SE 

EP 876660 A1 E G10L-003/02 Based on patent WO 9819297

Designated States (Regional) : BE DE FR GB 
US 5950162 A G10L-005/06 

International Patent Class (Main) : G10L-003/02; G10L-005/06 
International Patent Class (Additional) : G10L-009/00 

Generation of segment durations in text-to- speech system. . . 

. . .phones to sequence of articulatory features, using prominence and 
boundary information as well as predetermined set of rules for type, 
phonetic context and syntactic and prosodic context 

. . .Abstract (Basic) : The method for generating segment durations in a text 
-to-speech system comprises generating an information vector for each 
segment description. The information vector includes a description of a 
sequence of segments surrounding described segment and. . . 

. . .The information vector is supplied as an input to a pre-trained neural 
network. A description is generated representing the duration
associated with the described segment... 
. . .ADVANTAGE - Avoids effects when network depends on chance correlations 

in training data and provides efficient segment durations... 
...Title Terms: DURATION ; 
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Text conversion method for generating audible signals using neural 

network - training neural network to associate text of recorded spoken 
messages with speech of spoken messages by converting recorded spoken 
messages into series of audio frames of fixed duration 

Patent Assignee: MOTOROLA INC (MOTI ) 

Inventor: CORRIGAN G E; GERSON I A; KARAALI O 

Number of Countries: 022 Number of Patents: 009 
Priority Applications (No Type Date): US 94234330 A 19940428; US 96622237 A
19960322
Patent Details:
Patent No Kind Lan Pg Main IPC Filing Notes
WO 9530193 A1 E 40 G06F-015/18
Designated States (National): AU CA CN FI JP
Designated States (Regional): AT BE CH DE DK ES FR GB GR IE IT LU MC NL
PT SE
FI 9505608 A G10L-000/00
AU 9521040 A G06F-015/18 Based on patent WO 9530193
EP 710378 A1 E 40 G06F-015/18 Based on patent WO 9530193
Designated States (Regional): DE FR GB SE
JP 8512150 W 40 G10L-003/00 Based on patent WO 9530193
AU 675389 B G06F-015/18 Previous Publ. patent AU 9521040
Based on patent WO 9530193
US 5668926 A 19 G10L-005/06 Cont of application US 94234330
CN 1128072 A G06F-015/18
CA 2161540 C E G10L-005/04 Based on patent WO 9530193
International Patent Class (Main): G06F-015/18; G10L-000/00; G10L-003/00;
G10L-005/04; G10L-005/06



Text conversion method for generating audible signals using neural 

network - ... 

. . training neural network to associate text of recorded spoken messages 
with speech of spoken messages by converting recorded spoken messages 
into series of audio frames of fixed duration 

. .Abstract (Basic) : of converting text into audible signals involves using 
recorded audio messages (204) which are converted into a series of 
audio frames (205) having a fixed duration (213) . Each audio frame is 
assigned a phonetic representation (203) and a target acoustic 
representation. The phonetic representation (203) is a binary word
that represents the phone and articulation characteristics of the audio 
frame. The target representation is a vector of audio information such 
as pitch and energy. . . 
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After training, the neural network is used in conversion of text 
into speech . Text that is to be converted is translated into a series 
of phonetic frames of the same form as phonetic representations (203) 
and having a fixed duration (213) . The neural network then produces 
acoustic representations in response to context descriptions (207) that 
include some of the phonetic frames. The acoustic representations are 
then converted into speech. . . 
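
The fixed-duration framing step mentioned above can be illustrated with a
short sketch. It assumes the recorded message is already available as a
list of PCM samples; split_into_frames and its parameters are
hypothetical names, and the handling of the trailing partial frame
(dropped here) is a choice made for the example, not something the
abstract specifies.

```python
from typing import List, Sequence


def split_into_frames(samples: Sequence[int],
                      sample_rate_hz: int,
                      frame_ms: int) -> List[Sequence[int]]:
    """Cut a recording into consecutive frames of one fixed duration.

    A trailing remainder shorter than one frame is dropped (assumption).
    """
    frame_len = sample_rate_hz * frame_ms // 1000  # samples per frame
    if frame_len <= 0:
        raise ValueError("frame duration too short for this sample rate")
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len]
            for i in range(0, last_start + 1, frame_len)]
```

Each resulting frame would then be paired with a phonetic representation
and a target acoustic representation before training, as the abstract
describes.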
Abstract (Equivalent) : A method for training and utilizing a neural 
network that is used to convert text streams into audible signals, the 
method comprising the steps of . . . 

wherein training a neural network utilizes the steps of... 

1b) dividing the recorded audio messages into a series of audio frames,
wherein each audio frame has a fixed duration ; 



1f) training a feed-forward neural network with a recurrent input
structure to associate an acoustic representation of the plurality of 
acoustic representations with the context description of the each audio 
frame. . . 

a phonetic frame of the series of phonetic frames includes one of the 
plurality of phonetic representations, and wherein a phonetic frame has 
the fixed duration ; 



1i) converting, by the neural network, the phonetic frame into one of
the plurality of acoustic representations, based on the one of the 
plurality of context descriptions; and 

Title Terms: NETWORK ; 
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Communication method for use in IP-based telephone communication, 
involves converting voice data and selectively generated voice text to 
packetized signal which is then transmitted over packet switched network 

Patent Assignee: ERICSSON INC (TELF ) 
Inventor: HIRI F 

Number of Countries: 089 Number of Patents: 002 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

WO 200033552 A1 20000608 WO 99US28215 A 19991129 200040 B

AU 200017472 A 20000619 AU 200017472 A 19991129 200044 

Priority Applications (No Type Date) : US 98200879 A 19981130 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 

WO 200033552 A1 E 22 H04M-007/00

Designated States (National) : AE AL AM AT AU AZ BA BB BG BR BY CA CH CN 
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Designated States (Regional) : AT BE CH CY DE DK EA ES FI FR GB GH GM GR 

IE IT KE LS LU MC MW NL OA PT SD SE SL SZ TZ UG ZW 
AU 200017472 A H04M-007/00 Based on patent WO 200033552 

International Patent Class (Main) : H04M-007/00 
International Patent Class (Additional) : H04L-012/64 

use in IP-based telephone communication, involves converting voice 
data and selectively generated voice text to packetized signal which is 
then transmitted over packet switched network 

Abstract (Basic) : 

... data is then processed and applied with a work list. One or more 

speech patterns within the voice data is recognized to selectively 
generate voice text. The voice data and voice text are converted to
packetized signal which is then transmitted over a packet switched
network.

... In IP based telephone communication computer networks such as
internet.

...Due to connection between two or more PCs over the internet , audio and 
video data generated in one PC is packetized and transported over the 
internet for display on the other PC so users may view each other
while simultaneously speaking to each other. Allows caller to view
received video data while concurrently transmitting video and speech 
generated data 
...Title Terms: NETWORK 
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013192316 

WPI Acc No: 2000-364189/200031 

XRPX Acc No: N00-272527 
Information delivery system for Internet based subscriber network, 
updates information in playback device according to subscriber 
preferences, when device gets disconnected from subscriber PC 

Patent Assignee: LEXTRON SYSTEMS INC (LEXT-N)

Inventor: KIKINIS D 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

US 6055566 A 20000425 US 985562 A 19980112 200031 B 

Priority Applications (No Type Date) : US 985562 A 19980112 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
US 6055566 A 8 G06F-015/16 

International Patent Class (Main) : G06F-015/16 

Information delivery system for Internet based subscriber network, 
updates information in playback device according to subscriber 
preferences , when device gets disconnected from subscriber PC 

Abstract (Basic) : 

A subscriber PC (123) downloads text documents from Internet 
connected host server (120) . When playback device (110) is connected 
to PC, text documents are stored. The device renders the text 
documents, as speech on-demand, when disconnected from PC. A radio 
broadcast unit and a receiver updates information in the device, 
according to subscriber preferences, when the device is disconnected 
from the PC. 

...A host server (120) compiles information, stores subscriber
preferences and sorts information. The server adjusts stored
subscriber preferences in accordance with subscriber use patterns
and delivers information, as text documents through Internet (100).
The host server codes text documents delivered to subscriber for
controlling audio characteristics including inflection. An
INDEPENDENT CLAIM is also included for multimedia information output
procedure...

...For providing various multimedia data to PC subscribers through
Internet.

...Facilitates connection of localized media sources, since a digital 
network can be replicated along with host server and can be 
distributed to different usage areas... 

...The figure shows overview diagram of Internet-based media delivery
system...

...Internet (100...

...Host server (120...

...Subscriber PC (123

...Title Terms: SUBSCRIBER ; NETWORK ; 
012934138

WPI Acc No: 2000-105985/200009 
XRPX Acc No: N00-081397 

Electronic message delivering system e.g. for e-mail, voice mail for
digital mobile phones

Patent Assignee: LOGICA INC (LOGI-N) 

Inventor: FERNANDEZ D E; HAYDEN B; HUDSON M; PETRI E D G
Number of Countries: 019 Number of Patents: 001 
Patent Family: 
WO 9965256 A2 19991216 WO 99US13183 A 19990610 200009 B

Priority Applications (No Type Date) : US 9888781 A 19980610 
Patent Details: 
Designated States (National) : JP

Designated States (Regional) : AT BE CH CY DE DK ES FI FR GB GR IE IT LU 
MC NL PT SE 

International Patent Class (Main) : H04Q-007/00 
Abstract (Basic) : 

The user selected data is retrieved from the e-mail address of 
the user via internet through dial-up connection, LAN. The message 
is filtered by the user specified configurations and summarized with 
a message identifier. The message is then delivered to the user by 
message network such as public switched telephone network (PSTN).

The system consists of a digitized interactive voice response 
(IVR) capable of receiving message identifier and user instructions 
via data delivery interface protocols like SMTP, TAP etc. Text to 
speech system is provided for converting message text to speech 
for playing back message on user request. Reply e-mails with address
derived from the identified e-mail can be sent through the voice
mail notification server. The retrieval system repeatedly polls the
user e-mail address for new messages where the polling depends on the
e-mail activity. An INDEPENDENT CLAIM is also included for electronic
message delivering...

...Used for delivering messages such as e-mail, voice mail to digital
mobile phones...

...Achieves immediate notification of e-mail arrivals due to the repeated
polling of e-mail address. Offers option to get data in text or in
speech format due to usage of IVR. Selection of message is made
possible by using filtering...



27/3,IC,K/4 (Item 4 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv.

012865024 

WPI Acc No: 2000-036857/200003 

XRPX Acc No: N00-027633 
Transmitting information over mobile telephone network by general 
broadcasting - provides information to telephone users over restricted 
geographic region, including numbers which can be dialled for further 
information 

Patent Assignee: TELIA AB (TELI-N) 

Inventor: EMILSSON S 

Number of Countries: 001 Number of Patents: 001 
Patent Family:

Patent No Kind Date Applicat No Kind Date Week 

SE 9801267 A 19991010 SE 981267 A 19980409 200003 B 

Priority Applications (No Type Date) : SE 981267 A 19980409 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
SE 9801267 A 10 H04M-011/08 

International Patent Class (Main) : H04M-011/08 

International Patent Class (Additional): H04H-001/00; H04H-009/00; 
H04Q-007/22 

Transmitting information over mobile telephone network by general 
broadcasting...

...provides information to telephone users over restricted geographic
region, including numbers which can be dialled for further information

...Abstract (Basic): NOVELTY - The broadcast provides information which
needs to be delivered to a large number of mobile telephone users, is
carried out over a restricted geographic region, and contains
information on telephone numbers that can be dialled to receive further
information. IMAGING and COMMUNICATION - PREFERRED FEATURES: The
information is short text-based and is sent over a GSM network using
a short message service cell broadcast (SMSCB) system...

...ADVANTAGE - Text or voice information can be sent directly by e.g. 
a seller, club or organisation to a large number of mobile telephone
users.

...Title Terms: NETWORK ; 



27/3,IC,K/5 (Item 5 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv.

012754110 

WPI Acc No: 1999-560227/199947 
XRPX Acc No: N99-413818 

Conversant-type voice recognition and command process for computer
communication from remote location

Patent Assignee: LUCENT TECHNOLOGIES INC (LUCE ) 
Inventor: YAKER R 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

US 5950167 A 19990907 US 9813665 A 19980126 199947 B 

Priority Applications (No Type Date) : US 9813665 A 19980126 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
US 5950167 A 15 G10L-009/06 

International Patent Class (Main) : G10L-009/06 

Abstract (Basic) : 

...A user-entered tone and voice signals transmitted from a
telephone (21) to a controller (15) are converted as
application-specific commands which are executed by a processor. The
user is prompted with voiced queries in a VCS (16) to issue sequenced
commands. The user interrupts an ongoing application program routine
with voice commands to invoke new application program functions.

...A voice command system (VCS) (16) consists of a voice
recognition unit (VRU) (17), a voice to text and text to voice
converter (18). The controller (15) connects the VCS and a personal
computer (1) to a telephone network (20). The voice to text converter
consists of a software for converting voice commands and tone signals
to application program-specific commands. The signals include...

...printer, copier, facsimile, or an e-mail address. The processor executes 
the commands under the control of the controller to perform application 
program functions. The user interrupts the ongoing application 
program such as word processor (12) with voiced commands to invoke the 
new application program such as spread sheet (13), e... 

...The ability of a user to direct application program files on personal
computer to a destination, by remotely-issued tone or voice commands
greatly enhances the utility of personal computers...

...The figure shows the block diagram of the controller, VCS and personal
computer connected to the telephone network.



...Telephone network (20 



27/3,IC,K/6 (Item 6 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv.



012712162 

WPI Acc No: 1999-518275/199943 
XRPX Acc No: N99-385451 

Self-contained intelligent radio for receiving broadcasts from both local
radio stations and world wide web WWW

Patent Assignee: QURESHEY S (QURE-I); QURESHEY W (QURE-I) 
Inventor: QURESHEY S; QURESHEY W 

Number of Countries: 083 Number of Patents: 002 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

WO 9938266 A1 19990729 WO 99US1001 A 19990119 199943 B

AU 9923240 A 19990809 AU 9923240 A 19990119 200001 



Priority Applications (No Type Date) : US 9896703 A 19980612; US 9872127 A
19980122
Patent Details:

Patent No Kind Lan Pg Main IPC Filing Notes
WO 9938266 A1 E 32 H04B-001/06

Designated States (National) : AL AM AT AU AZ BA BB BG BR BY CA CH CN CU 
CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC 
LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL 
TJ TM TR TT UA UG UZ VN YU ZW

Designated States (Regional) : AT BE CH CY DE DK EA ES FI FR GB GH GM GR
IE IT KE LS LU MC MW NL OA PT SD SE SZ UG ZW
AU 9923240 A H04B-001/06 Based on patent WO 9938266 

International Patent Class (Main) : H04B-001/06 



Self-contained intelligent radio for receiving broadcasts from both local 
radio stations and world wide web WWW 

Abstract (Basic) : 

...A stored software program is configured to connect a modem (206)
to an Internet service provider and receive digitized audio
broadcasts from the Internet service provider. The program is further
configured to provide a select broadcast display that allows a user
to selectably connect a program broadcast to the input of an audio
amplifier (222) from the AM or FM radio station or the WWW.

A display device (11) provides information to the user. A
tuning control (114) is operated to receive radio frequency RF signals
from the radio broadcast stations. The stereo speakers (106,108) are
operably connected to the audio amplifier. The modem transmits and
receives digital data over a communications network. A data storage
device (210) stores the software program...

Can be used for Internet telephony, voicemail, text-to-voice mail,
voice-to-text electronic mail and voice activated commands...



Allows user to receive Web radio broadcasts in a manner similar to 
the ease and low cost with which the user receives regular radio 
broadcasts. Relieves user of complicated tasks associated with 
installing and configuring computer software since user interface 
that is less like computer program and more like conventional radio is 
provided, thereby making radio easy to use. User can tune into Web,
AM or FM broadcast with ease through tuning control. Has lower cost, 
smaller size, lower power consumption, less upkeep and maintenance and 
more convenience compared with full-fledged computer. Provides hardware 
and software necessary to receive digitized radio from Web without 
need for personal computer or other expensive equipment... 

Title Terms: WEB 



27/3,IC,K/7 (Item 7 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv.

012337095 

WPI Acc No: 1999-143202/199912 
XRPX Acc No: N99-104021 
Method for delivering electronic mail message from remote source to 
subscriber station - receives and stores email message, sends signal to 
subscriber station indicating message is waiting retrieval, sends 
request to read message, retrieves waiting message, converts it into 
speech message and sends this to subscriber station 
Patent Assignee: ERICSSON INC (TELF ) 
Inventor: NELSON M P 

Number of Countries: 081 Number of Patents: 003 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

WO 9905626 A1 19990204 WO 98US14974 A 19980720 199912 B

AU 9886591 A 19990216 AU 9886591 A 19980720 199926 

US 6061718 A 20000509 US 97899772 A 19970723 200030 



Priority Applications (No Type Date) : US 97899772 A 19970723 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
WO 9905626 A1 17 G06F-017/60
AU 9886591 A G06F-017/60 Based on patent WO 9905626

US 6061718 A G06F-013/38

International Patent Class (Main): G06F-013/38; G06F-017/60
International Patent Class (Additional) : G06F-015/17

Method for delivering electronic mail message from remote source to 
subscriber station. . . 

...receives and stores email message, sends signal to subscriber station 
indicating message is waiting retrieval, sends request to read message, 
retrieves waiting message, converts it into speech message and sends this 
to subscriber station 

...Abstract (Basic): NOVELTY - The electronic email delivery system (44 to
50) delivers email messages to and from a subscriber station (30) in
a wireless system. The system converts the messages sent to the
subscriber station from text to speech. The delivery system
converts the email messages sent by the subscriber station from
speech to text for delivery to a remote destination. DETAILED
DESCRIPTION - Subscriber station is a mobile station and the message
waiting signal is sent on an analog or digital control channel in the
system...

...USE - For delivering electronic mail messages in wired or wireless
communications system, messages are of unrestricted length and sent to
fixed or mobile subscriber who can learn contents of messages without
being distracted from performing other activities...

...ADVANTAGE - System does not restrict the length of the email message to
a mobile subscriber and allows the subscriber to learn the contents
of the message without being distracted from performing other
activities. DESCRIPTION OF DRAWING(S) - The drawing shows a block
diagram of an email delivery system. (44) email server; (50) base
station; (30) subscriber station...

...Title Terms: SUBSCRIBER ; 



27/3,IC,K/8 (Item 8 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv.



012275506 

WPI Acc No: 1999-081612/199907 
XRPX Acc No: N99-058697 
Information transmission method for telecommunications networks - has
subscriber requesting information and response text-to-speech coded
with telephone access point signal conversion.
Patent Assignee: TELECOM PTT FORSCHUNG & ENTWICKLUNG (TELE-N); SWISSCOM AG
(SWIS-N)
Inventor: VAN KOMMER R 

Number of Countries: 079 Number of Patents: 003 
Patent Family: 

Patent No    Kind  Date      Applicat No  Kind  Date      Week
WO 9859486   A1    19981230  WO 97CH246   A     19970620  199907  B
AU 9730864   A     19990104  AU 9730864         19970620  199921
                             WO 97CH246         19970620
EP 993730    A1    20000419  EP 97925810  A     19970620  200024
                             WO 97CH246   A     19970620

Priority Applications (No Type Date) : WO 97CH246 A 19970620
Patent Details:

Patent No Kind Lan Pg Main IPC Filing Notes
WO 9859486 A1 F 35 H04M-003/50

Designated States (National) : AL AM AT AU AZ BA BB BG BR BY CA CH CN CU 



Dialog patbib 9139 a a 25723 



AB 7 



Report for SPE Fan Tsang 08/948328 September 27, 2000 09:01 



CZ DE DK EE ES FI GB GE GH HU IL IS JP KE KG KP KR KZ LC LK LR LS LT LU 
LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA 
UG US UZ VN YU ZW 

Designated States (Regional) : AT BE CH DE DK EA ES FI FR GB GH GR IE IT 
KE LS LU MC MW NL OA PT SD SE SZ UG 
EP 993730 A1 F H04M-003/50 Based on patent WO 9859486

Designated States (Regional): AT BE CH DE DK ES FI FR GB GR IE IT LI LU 
MC NL PT SE 

AU 9730864 A H04M-003/50 Based on patent WO 9859486 

International Patent Class (Main) : H04M-003/50 

has subscriber requesting information and response text-to-speech
coded with telephone access point signal conversion. 

...Abstract (Basic): The information transmission method has a subscriber
making a local telephone call to a telephone information service (1),
for instance a weather forecast...

...The information is coded in semantic form using Text to Speech
conversion (TTS) and transmitted over the transmission network
(10). Prior to the subscriber telephone (30) there is a convertor (2)
which converts the text format to digital words for normal telephone
reception...

...ADVANTAGE - The transmission of the information using semantic code
reduces transmission bandwidth and thus loads the network less than
previous systems...

...Title Terms: NETWORK ; SUBSCRIBER ; 



27/3,IC,K/9 (Item 9 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv.



012051070 

WPI Acc No: 1998-467980/199840 

XRPX Acc No: N98-364682 
Announcement provision method in communication network - using service 
control point in evaluation of supportability of announcements by unit 
which converts received text into message for caller 

Patent Assignee: SIEMENS AG (SIEI ) 

Inventor: NIMPHIUS K 

Number of Countries: 022 Number of Patents: 005 
Patent Family: 



Patent No      Kind  Date      Applicat No  Kind  Date      Week
WO 9837716     A2    19980827  WO 98DE377         19980211  199840  B
EP 962106      A2    19991208     98910604        19980211  200002
                               WO
CN 1248377     A     20000322     98802749        19980211  200032
BR 9807258     A     20000523  BR                 19980211  200035
                               WO 98DE377   A     19980211
JP 2000509945  W     20000802  JP 98536142  A     19980211  200042
                               WO 98DE377   A     19980211

Priority Applications (No Type Date) : DE 1007060 A 19970221 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
WO 9837716 A2 G 22 H04Q-007/22 

Designated States (National) : BR CN JP KR US 

Designated States (Regional) : AT BE CH DE DK ES FI FR GB GR IE IT LU MC 
NL PT SE 
EP 962106 A2 G H04Q-003/00 Based on patent WO 9837716

Designated States (Regional): AT BE DE ES FR GB IT 
CN 1248377 A H04Q-003/00 

BR 9807258 A H04Q-007/22 Based on patent WO 9837716 

JP 2000509945 W 26 H04M-003/50 Based on patent WO 9837716 

International Patent Class (Main) : H04M-003/50; H04Q-003/00; H04Q-007/22 
International Patent Class (Additional) : H04M-003/42 

Announcement provision method in communication network - 

...Abstract (Basic): The method involves networked mobile switching centres
and visitor location registers (MSC/VLR) to which subscriber access
terminals (MS) can be connected. Announcement texts are introduced into
a service control point (SCP). A message initiated on the basis of a
subscriber's call contains information on the supportability of
announcements by an announcement unit (IP...

...message is received and evaluated before another message containing the
announcement is transmitted. The announcement unit receiving a text
converts it into an announcement for transmission over a speech
channel to the caller...

...ADVANTAGE - Ensures only announcements are introduced into SCP. Ensures 
highly flexible system for implementing announcements by converting 
received text into speech.

...Title Terms: NETWORK ; 



27/3,IC,K/10 (Item 10 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv.

012019931 

WPI Acc No: 1998-436841/199837 

XRPX Acc No: N98-340382 
Telecommunications system for deaf persons - has platform which routes 
call based on equipment type and which has signal detection circuitry 
detecting whether call is voice call 

Patent Assignee: AT & T CORP (AMTT ) 

Inventor: AUGUST K G 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

US 5787148 A 19980728 US 95583144 A 19951228 199837 B 

Priority Applications (No Type Date) : US 95583144 A 19951228 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
US 5787148 A 8 H04M-011/00 

International Patent Class (Main) : H04M-011/00 

International Patent Class (Additional) : H04M-003/42; H04M-007/00 

...Abstract (Basic): The system is for use in a telephone network to
process communications with a telecommunications relay centre. The...

...destination for a text telephone party and information identifying the 
relay centre. The platform includes signal detection circuitry for 
determining that the call is a voice call. The platform routes the 
voice call to the relay centre, and the potential destination is 
identified to the relay centre in association with the voice call... 
...ADVANTAGE - Allows users to have one telephone number for text and
voice telephones...



27/3,IC,K/11 (Item 11 from file: 350)

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv.



011648696 

WPI Acc No: 1998-065604/199807 

XRPX Acc No: N98-051614 
Called number identity announcement e.g. for telephone system - involving 
calling party to receive voice announcement identifying called party 
prior to connection to called party allowing hangup 

Patent Assignee: AT & T CORP (AMTT ); AMERICAN TELEPHONE & TELEGRAPH CO 
(AMTT ) 

Inventor: SALIMANDO S C 

Number of Countries: 027 Number of Patents: 005 
Patent Family: 



Patent No    Kind  Date      Applicat No  Kind  Date      Week
EP 818913    A2    19980114  EP 97111859        19970711
JP 10084410  A     19980331  JP 97185630        19970711  199823
CA 2198797   A     19980112  CA 2198797         19970228  199927
US 5970133   A     19991019  US 96678933        19960712  199950
MX 9705116   A1    19980101  MX 975116          19970708  199952



Priority Applications (No Type Date) : US 96678933 A 19960712 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
EP 818913 A2 E 13 H04M-003/50 

Designated States (Regional) : AL AT BE CH DE DK ES FI FR GB GR IE IT LI
LT LU LV MC NL PT RO SE SI
JP 10084410 A 11 H04M-001/56

CA 2198797 A H04Q-003/72
US 5970133 A H04M-003/42
MX 9705116 A1 H04M-001/00

International Patent Class (Main) : H04M-001/00; H04M-001/56; H04M-003/42;
H04M-003/50; H04Q-003/72
International Patent Class (Additional) : H04M-001/57; H04M-011/00;
H04Q-003/42; H04Q-003/545

. . .Abstract (Basic) : The announcement system then converts text data to 
voice, or passes on voice data, and delivers it to the calling
party before the connection is finalised. The message identifies the 
called party and allows time for the calling party to hang... 

. . .ADVANTAGE - Allows users to ensure they have dialled correct number 
avoiding undesired charges and inefficient network use. . . 



27/3,IC,K/12 (Item 12 from file: 350) 

DIALOG (R) File 350: Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv. 

010216031 

WPI Acc No: 1995-117285/199516 

XRPX Acc No: N95-092568 
Producing and processing text documents - setting up text document 
using speech before converting to text data using speech detector 
and allowing text to be corrected, edited and extended by speech 

Patent Assignee: ALCATEL SEL AG (COGE ); ALCATEL NV (COGE ) 
Inventor: HUZENLAUB R; KOPP D; DE SANTIS G; RICCIO A; RIGOSI F
Number of Countries: 013 Number of Patents: 004 
Patent Family: 



Patent No    Kind  Date      Applicat No  Kind  Date      Week
EP 644680    A2    19950322     94113016        19940820
DE 4331710   A1                                 19930917  199517
JP 7193647   A                                            199539
US 5920835   A               US
                             US 97869476        19970605





Priority Applications (No Type Date) : DE 4331710 A 19930917 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
EP 644680 A2 G 12 H04M-003/42 

Designated States (Regional) : AT BE CH DE ES FR GB IT LI NL SE 
DE 4331710 A1 11 H04M-003/50
JP 7193647 A 7 H04M-011/00 

US 5920835 A G10L-005/06 Cont of application US 94305849 

International Patent Class (Main): G10L-005/06; H04M-003/42; H04M-003/50; 
H04M-011/00 

International Patent Class (Additional) : G06F-003/16; G06F-013/00; 
G10L-003/00; G10L-005/02; G10L-007/08; H04M-011/10; H04N-001/00 

setting up text document using speech before converting to text 
data using speech detector and allowing text to be corrected, edited 
and extended by speech 

. . .Abstract (Basic) : text documents to be dictated and transmitted using a 
telecommunication device. Text is dictated in the form of speech. The 
speech is then converted to text data using speech recognition. The 
text data can be corrected by means of speech and can be edited in text 
data. The text can be transmitted as text data to subscribers via a 
telecommunication network.



...The device for dictating and transmitting text documents includes a
dictating machine (HS). A device is provided for transmitting spoken
speech to a speech detector (SEK). The detector (SEK) converts
speech into text data. Software is provided to correct (i) the text
data using speech. Software is also provided to edit (ii) the text
data. Another device transmits the text data to a further subscriber



27/3,IC,K/13 (Item 13 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv.

008798331 

WPI Acc No: 1991-302345/199141 
XRPX Acc No: N91-231582 

Network order entry service for telecommunications system - can receive
orders by facsimile, transforming data into text form using OCR
circuitry, and stores text converted speech
Patent Assignee: ANONYMOUS (ANON ) 
Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

TP 99105 A 19910925 TP 9199105 A 19910920 199141 B 

Priority Applications (No Type Date) : TP 9199105 A 19910920 




International Patent Class (Additional) : H04M-000/01 

Network order entry service for telecommunications system...

...can receive orders by facsimile, transforming data into text form using
OCR circuitry, and stores text converted speech

...Abstract (Basic): An automated Order Entry System (OES) resides in a
telecommunications network and is arranged to receive information
from callers desiring to place orders with a called party (subscriber).
The information may be entered by callers as speech and/or as touch
tone digits, in response to voice prompts generated by, for example, an
AT and T Conversant Voice Response System located in the network.
Information entered by callers in speech form is processed by
speech-to-text conversion circuitry and stored in an electronic
mail-box assigned to the subscriber or combined with other orders,
possibly converted to electronic data interchange format, and forwarded
to the subscriber's computer. The OES can also receive orders by FAX,
transform the information to text form using optical character
recognition circuitry, and combine the FAX orders with speech-based
orders before being transmitted to the subscriber. (Dwg.No.0/0)

Title Terms: NETWORK ; 



27/3,IC,K/14 (Item 14 from file: 350)

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv.

007836207 

WPI Acc No: 1989-101319/198914 

XRPX Acc No: N89-077303 
Multi-media mail system consolidating voice and text mail - has
transmit-receive mode selectors between analog telephone network and
paired voice and text mail centres

Patent Assignee: HITACHI LTD (HITA ) 

Inventor: SHIBATA Y 

Number of Countries: 007 Number of Patents: 005 
Patent Family: 
H04M-011/00; H04Q-003/00

...has transmit-receive mode selectors between analog telephone network
and paired voice and text mail centres

...Abstract (Basic): The system has a voice mail system and a text mail
system utilising an analog telephone network (1004). A centre data/voice
transmit/receive mode selector (1003) is provided between a
paired voice mail centre (1002) and text mail centre (1000) and the
analog telephone network. A terminal data/voice transmit/receive
mode selector (1003) is provided between a paired voice mail terminal
(1007) and text mail terminal (1006) and the analog telephone network

...Text mail centre and voice mail centre are physically in one centre but
are logically or functionally separated. Subscriber data, charge
data, and voice mail and text mail control information are communicated
between a text mail centre processor and a voice mail centre processor.
When turn-off of a modem carrier is detected, a data/voice transmit/receive
mode selector provided at a predetermined section of the
system selects a voice transmitter/receiver (voice mail centre,
and microphone and speaker of the terminal). When the modem carrier is
detected and a predetermined specific data is also detected, the
selector

...Abstract (Equivalent): A multimedia mail system having a voice mail
system and a text mail system utilizing an analog telephone network
(1004), comprising: a voice mail centre (1002) and a text mail centre
(1001), and a centre data/voice transmit/receive mode selector
(1003) provided between the voice mail centre and text mail centre, and
said analog telephone network (1004); characterised by said centre
data/voice transmit/receive mode selector (1003) being adapted to
freely switch text data and voice data into one communication,
whereby for switching voice data to text data a carrier detect signal
(CD) and a data indicative of this switching is used; a terminal data/voice
transmit/receive mode selector (1003') provided between a
voice mail terminal (1007) and a text mail terminal (1006) and said
analog telephone network; and the voice mail terminal (1007) and text
mail terminal (1006) constituting a multimedia terminal being capable
of sending and/or receiving voice data and...

...Abstract (Equivalent): A multimedia mail system utilises an analog
telephone network and interconnects processors at a voice mail centre
and a text mail centre and provides data/voice transmit/receive
mode selectors between the analog telephone network and the paired
voice mail centre and text mail centre and between the analog
telephone network and paired voice mail terminal and text mail
terminal so that voice and text data can be switched during
communication to provide a consolidated voice...
...Title Terms: NETWORK ; 
ABSTRACT

...SOLVED: To quickly transfer a message between a remote place and a nurse
station, efficiently provide information for a corresponding patient, and
smoothly perform a text/voice mixed information transmission through
a document such as chart...

...SOLUTION: A portable radiocommunication slave machine 133 performs
radiocommunication with a radiocommunication parent machine 132, and is
connected to a local area network through a network connecting device
130 having slave user control means 131. A nursing information system
server 110 is further connected to the local area network. An input
device 120 of the nursing information system server 110 is provided with
a keyboard 121, a mouse 122, and a microphone 123, and an output device 124
thereof is provided with a display...
Interactive prosody user interface in text-to-speech system,
speech synthesizer system
Patent Assignee: LUCENT TECHNOLOGIES INC (LUCE ) 
Inventor: TANENBLATT M A 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

US 6006187 A 19991221 US 96720759 A 19961001 200010 B 

Priority Applications (No Type Date) : US 96720759 A 19961001 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
US 6006187 A 11 G10L-005/02 

International Patent Class (Main) : G10L-005/02 

Interactive prosody user interface in text-to-speech system,
speech synthesizer system 
Abstract (Basic) : 

Duration controller sets speaking rate relative word duration 
of selected words to be uttered by synthesized voice. A creation 
unit forms text string using selected words and prosody 
characteristic, to apply changed prosody characteristic to voiced 
output of at least one of displayed words as to which changed prosody 
characteristic is effected. 

The duration controller enables user to dynamically effect
change in prosody characteristic for one of displayed words. Words
and punctuation in text input into word boxes are selected using mouse
click, after which they are displayed visually. The duration controller
operates in conjunction with the display unit which has indicia of
change in one prosody characteristic for the displayed words. An
INDEPENDENT CLAIM is also included for the altering method of prosody
characteristics of synthesized voice in text-to-speech system...

...In text-to-speech system, speech synthesizer system for
controlling acoustical characteristic of synthesized voice...

...The prosody user interface includes unlimited undo feature which
allows any changes that are made to be reversed, thus giving the user
freedom to explore various alternatives while retaining the ability to
return to the previous state...

...The figure illustrates the flowchart for transmitting escape sequences
relating to phrase contours to text-to-speech synthesizer process

...Title Terms: PROSODY ; USER ; 
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WPI Acc No: 2000-085402/200007 

XRPX Acc No: N00-066931 
Integrated messaging and voice-free cellular telephone communication 
system for use by hearing impaired, mute and deaf person 

Patent Assignee: INT BUSINESS MACHINES CORP (IBMC ) 

Inventor: BRUNET P T; ITTYCHERIAH A P; NARAYANASWAMI C; PICHENY M A;
RAMABHADRAN B
Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

US 5995590 A 19991130 US 9835493 A 19980305 200007 B 

Priority Applications (No Type Date): US 9835493 A 19980305 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
US 5995590 A 7 H04M-011/00 

International Patent Class (Main) : H04M-011/00 

Abstract (Basic) : 

...Text to speech converter (14) connected to text data input
device (12) of telephone set, converts text of message into
synthesized speech signals. Input key of device (12) represents
separate entire group of selected words and phrases. Memory of
converter (14) stores words and phrases in form of synthesized speech
signals. Speech to text converter of other telephone set is connected
through link.

The speech to text converter converts the speech signal to text
signals in response to speech signals from the text to speech
converter of other telephone set...

...provides immediate and interactive response. To simplify the task of
typing or writing with input device, several preselected words or
phrases are used by the user, thereby avoiding the need for a guide
person for deaf, mute and hearing impaired person. Exhibits automatic
answering function when the hearing impaired person does not take the call...

...and reconfigurable, thereby shorthand notation is facilitated and amount
of typing is reduced and these techniques allow for more interactivity
during call and also reduce duration of call...

...Text to speech converter (14 



31/3,IC,K/3 (Item 3 from file: 350) 
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(c) 2000 Derwent Info Ltd. All rts. reserv. 

010107711 

WPI Acc No: 1995-008964/199502 

XRPX Acc No: N95-007432 
Message broadcasting unit in radio paging system - transmits message to 
specific users by means of coded message system and generates analog 
audio waveform 

Patent Assignee: IBM CORP (IBMC ); INT BUSINESS MACHINES CORP (IBMC ) 

Inventor: LEMAIRE C A; STRIEMER B L 

Number of Countries: 002 Number of Patents: 003 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

JP 6237207 A 19940823 JP 93315254 A 19931215 199502 B 

US 5594658 A 19970114 US 92993278 A 19921218 199709 
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US 5613038 A 19970318 US 92993278 A 19921218 199717 

Priority Applications (No Type Date) : US 92993278 A 19921218; US 95469307 A
19950606
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
JP 6237207 A 9 H04B-007/26 

US 5594658 A 8 G06F-017/00 Div ex application US 92993278 

US 5613038 A 8 G10L-005/02 

International Patent Class (Main) : G06F-017/00; G10L-005/02; H04B-007/26 
International Patent Class (Additional) : G10L-009/00; H04M-001/64 

...transmits message to specific users by means of coded message system
and generates analog audio waveform

...Abstract (Basic): address (143) of the specific receiver. The receivers 
receive the transmitted message pattern using a receiver antenna and 
store it in a data buffer. The user operates a switch control unit in 
the receiver for choosing the stored data by means of mode control 
buttons (157. . . 

...program memory. When the message and receiver addresses coincide, a
selector selects one message in the text portion of messages. The 
corresponding voice waveform is generated by the voice processor for 
the selected message. The analog output from the voice processor is 
amplified by an amplifier and fed to a speaker... 

. . .USE/ ADVANTAGE - Digital paging system. Facilitates individual 
transmission of messages according to user demand. . . 

...Abstract (Equivalent): switch means operable by a user of said
portable communications receiver for choosing one message among said
stored selected messages, and wherein said switch means includes...

...a first switch for sending a current one of said messages in said
sequence to a text-to-speech conversion means, wherein said
text-to-speech conversion means is coupled to said storing means and
responsive to said switch means, for producing analog speech waveforms
directly corresponding to the text portion...

...for choosing a next message in said sequence as said current message,
and wherein another operation of said second switch increases the speed
of said text-to-speech conversion means...

. . .A communications system for transmitting multiple individually addressed 
messages to a large number of users at different locations, 
comprising. . . 

...a first switch operable by a user for choosing a current one of said 
messages . . . 

...a second switch operable by said user for choosing a previous one of 
said messages . . . 

...a third switch operable by said user for choosing a next one of said 
messages . . . 

...text-to-speech conversion means responsive to said switch means and
coupled to said data storage means for generating analog speech 
waveforms directly representing the text portion of said chosen message 

...Title Terms: USER ; 
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Computer graphical message box location method for blind person - 
generating white noise when pointer is on message box but not on button
and using text-to-speech system for keystroke announcements
Patent Assignee: INT BUSINESS MACHINES CORP (IBMC ) 
Inventor: MCKIEL F A 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

US 5223828 A 19930629 US 91746838 A 19910819 199327 B 

Priority Applications (No Type Date) : US 91746838 A 19910819 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
US 5223828 A 9 H04Q-001/00 

International Patent Class (Main) : H04Q-001/00 

generating white noise when pointer is on message box but not on
button and using text-to-speech system for keystroke announcements

...Abstract (Basic): When a message box first appears, the text contents
are announced using a text-to-speech system. After the text is
announced, the push buttons available to respond to or cancel the
message box are also announced in order from left to right. Next, a
homing signal is provided for finding the message box. The homing
signal is a tone that increases in pitch as the pointer approaches
the message box. When the pointer enters the message box, the message
box text and the available push buttons are reannounced...

...As long as the pointer is on a button, the system remains silent. If the 
user desires to select a push button other than the default, the 
user may move the pointer to the left toward the other buttons... 
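
The homing behaviour described above can be sketched as follows. This is a minimal illustration only, assuming a linear mapping from pointer distance to tone frequency; the function name, the frequency range and the distance scale are hypothetical, since the abstract states only that the pitch rises as the pointer approaches the message box.

```python
import math


def homing_pitch_hz(pointer, box_center, max_distance=1000.0,
                    low_hz=200.0, high_hz=2000.0):
    """Return a tone frequency that rises as the pointer nears the box.

    All numeric defaults are illustrative assumptions, not values taken
    from the patent record.
    """
    distance = math.dist(pointer, box_center)
    # Clamp so that every point farther than max_distance gets the lowest pitch.
    closeness = 1.0 - min(distance, max_distance) / max_distance
    return low_hz + closeness * (high_hz - low_hz)
```

A caller would re-evaluate the function on each pointer movement and feed the result to whatever tone generator is available.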

...USE/ADVANTAGE - Allows blind person to access and use computer graphical
user interface...
...Title Terms: GENERATE ; 
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Designated States (Regional) : DE FR GB 
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Designated States (Regional) : DE FR GB 
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Text to speech converter for handicapped users - ... 

...times input to synthesiser with natural speech rhythm by rules 
identifying terms and recognising syntactic information 

...Abstract (Basic): USE/ADVANTAGE - By deaf persons or sufferers from 
speech impediments. Freely generated text sequence is synthesised 
with proper emphases and pauses, without intervention of attendant. 
(14pp Dwg.No.1/4) 

...Abstract (Equivalent): The converter for synthesising a speech signal 
has a word detector responsive to a freely generated text signal for 
detecting individual words in the text signal and developing a string 
of words to be synthesised. A categorising device analyses each word. . . 

. . .A syntax augmenting device considers each word in the string and inserts 
a pause generation signal in the string of words, before or after the 
considered word, when appropriate, based on the category of the 
considered word. The syntax augmenting device inserts the pause 
generation signal before or after the considered word when 
appropriate, based on the considered word's category and the category 
of the one of the words . . . 
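
The word detection and pause insertion described above might be sketched as below. This is a hedged illustration under stated assumptions: the category table, the pause marker and the function name are hypothetical, because the abstract does not list the actual word categories or the rule set used by the syntax augmenting device.

```python
# Hypothetical category table: words assumed to call for a preceding pause.
PAUSE_BEFORE = {"and", "but", "or", "because"}
PAUSE_MARK = "<pause>"  # assumed stand-in for the pause generation signal


def insert_pauses(text):
    """Split free text into words and insert a pause marker before
    any word whose (assumed) category calls for one."""
    out = []
    for i, word in enumerate(text.split()):
        # Never start the string with a pause; ignore trailing punctuation.
        if i > 0 and word.lower().strip(",.;") in PAUSE_BEFORE:
            out.append(PAUSE_MARK)
        out.append(word)
    return out
```

The resulting list stands in for the string of words passed on to the synthesiser; a fuller version would also look at the category of the neighbouring word, as the abstract indicates.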

...Title Terms: USER ; 
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VOICE SYNTHESIZER 

ABSTRACT 

PROBLEM TO BE SOLVED: To provide a voice synthesizer presenting text 
information in voice comprehensible to each user.



...inserts an important part referring to an important part pattern table
104; inserts a control command in the acoustic parameter based on those
results; a prosody information generation part 106 generates prosody
information, an acoustic parameter; and an acoustic processing part 107
outputs vocally. Moreover, a user is identified in a user
identification part 108, and the contents of the change processing of the
parsing result by the parsing result changing part 105 is controlled
according to the user.
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January 16, 1998 (19980116) 
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OKI ELECTRIC IND CO LTD [000029] (A Japanese Company or 
Corporation) , JP (Japan) 
08-162886 [JP 96162886] 
June 24, 1996 (19960624) 

[6] G10L-003/00; G06F-017/21; G10L-005/04 



TEXT VOICE CONVERTING DEVICE 



ABSTRACT 

PROBLEM TO BE SOLVED: To provide the text voice converting device in
which synthesized sounds of many kinds of pronunciation styles are
generated and the reading is conducted with the phoneme patterns matched
with the liking of a user.



...SOLUTION: A synthesis parameter generating section 13 takes out the 
corresponding voice piece data based on a phoneme symbol column from a 
voice piece data storage section 14 and generates voice synthesis rhythm 
parameters such as the duration of phonemes, the length of a pause, power 
and fundamental frequency patterns. An uttering style specifying section 17 
specifies one desired uttering style from plural... 

... styles covering a reading style to a conversation style. A synthesis 
parameter changing means 16 deforms the voice synthesis phoneme parameters 
in accordance with the user's specification made by the section 17. A
voice synthesis section 15 synthesizes voices and outputs them in 
accordance with the voice synthesis phoneme parameters. 
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ABSTRACT 

PURPOSE: To provide the text recitation device which generates a natural
sensational speech where tastes of individual device users are
reflected...



...CONSTITUTION: On the basis of rhythm parameters of a calm speech which
are generated by the rhythms of an input text, ideal sensational
speech rhythm parameters showing a specific feeling are generated from
relative value information. An element piece selection part 110 selects and
extracts the element pieces of the rhythm parameters which are closest to
the...



. . . the element pieces within a range wherein naturalness is held and puts 
them close to the feeling speech rhythm parameters to obtain a desired 
feeling synthesized speech. 
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11, 1992 (19920311) 



TEXT SOUND CONVERTER 



ABSTRACT 

PURPOSE: To obtain a desired synthesized tone equipped originally with
reading, accent, intonation and breath, etc., which are originally
controlled by a user, with simple configuration by providing mode set and
control parts...

...CONSTITUTION: When the mode set part 40 selects any one of modes such as
a text/sound conversion mode, phoneme sound and meter symbol train
output mode and phoneme sound and meter symbol train synthesizing mode
according to a designation from an external part, the control part 50
discriminates the mode and controls the input/output of a text and a
phoneme sound and meter symbol train. When the text/sound conversion
mode is set, the control part 50 inputs the text and a text analysis part
30 analyzes the text. Then, the result of the analysis is imparted through
a sound synthesizing part 60 to a loudspeaker 61. When the phoneme sound
and meter symbol train output mode is set, the control part 50 analyzes the
text at the text analysis part 30 and outputs the generated phoneme sound
and meter symbol train in the form of a character code to the external
part. When the phoneme sound and meter symbol train synthesizing mode is
set, the control part 50 directly outputs the phoneme sound and meter
symbol train inputted from the external part, through the sound
synthesizing part 60. Thus, the user can freely change the phoneme
sound and meter symbol train and easily obtain the desired synthesized
tone.
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11, 1992 (19920311) 



TEXT SOUND CONVERTER 



ABSTRACT 

PURPOSE: To obtain a synthesized sound desired for a user with simple 
and easy operations by providing a means to control a phoneme sound and 
meter symbol train generating means... 

... auxiliary memory 23. When there is no symbol to show text analysis 
supporting information, the text is analyzed by the phoneme sound and meter 
symbol generating means 50. Then, a phoneme sound and meter symbol train 
required for reading a sentence as a sound is generated through a word 
division processing means 51, read processing means 52, accent application 
imparting means 53, pause and intonation setting means 54, and in a sound 
synthesizing part 60, the sound corresponding to the input text is 
synthesized and outputted from a loudspeaker 61. When the input text 
includes the symbol to show the text analysis supporting information, the 
designation of the symbol... 

...sentence can be read as intended. In the case of applying only a desired
support element to be designated, the other element is automatically
generated by the phoneme sound and meter symbol train generating means
50.
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Patent Assignee: MATSUSHITA ELECTRIC IND CO LTD (MATU ) ; MATSUSHITA DENKI 

SANGYO KK (MATU ) 
Inventor: PEARSON S 

Number of Countries: 026 Number of Patents: 002 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

EP 1005021 A2 20000531 EP 99309294 A 19991122 200035 B 

JP 2000231394 A 20000822 JP 99332612 A 19991124 200045 

Priority Applications (No Type Date) : US 98200335 A 19981125 
Patent Details : 

Patent No Kind Lan Pg Main IPC Filing Notes 
EP 1005021 A2 E 16 G10L-019/06 

Designated States (Regional) : AL AT BE CH CY DE DK ES FI FR GB GR IE IT
LI LT LU LV MC MK NL PT RO SE SI
JP 2000231394 A 48 G10L-013/00 

International Patent Class (Main) : G10L-013/00; G10L-019/06 
International Patent Class (Additional) : G10L-013/04 

Abstract (Basic) : 

Method consists in defining a filter model (12) to produce a 
filter (10), applying the speech signal to the filter to generate a 
residual signal, processing this by extracting time domain data to 
extract a set of data points defining a line of segments, calculating 
the length. . . 

...parameter. The steps are repeated (16) until the cost parameter is
minimized. A second filter inverse to the first processes the extracted
source signal to generate synthesized speech.

Method is for use in constructing text-to-speech and music
synthesizers and speech coding systems...

. . .Method produces a natural sounding waveform without distortions 
due to discontinuities... 
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012671893 

WPI Acc No: 1999-478000/199940 

XRPX Acc No: N99-355782 
Parametric synthetic text-to-speech generating method for
percussive musical instrument e.g. plucked violin 

Patent Assignee: APPLE COMPUTER INC (APPY ) 
Inventor: CECYS M L 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 
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Patent No Kind Date Applicat No Kind Date Week 

US 5930755 A 19990727 US 94212602 A 19940311 199940 B 

US 97779424 A 19970107 

Priority Applications (No Type Date) : US 94212602 A 19940311; US 97779424 A 

19970107 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 

US 5930755 A 16 G10L-005/02 Cont of application US 94212602 

International Patent Class (Main) : G10L-005/02 

Parametric synthetic text-to-speech generating method for
percussive musical instrument e.g. plucked violin 

Abstract (Basic) : 

A set of synthesizer control parameters representative of text
to be spoken, is generated and recorded. Among the recorded sound
samples, a voice source is selected. Based on the selected voice
source, speech synthesizer control parameters are converted into
output waveforms representative of synthetic speech to be spoken.

An INDEPENDENT CLAIM is also included for the parametric 
synthetic system for the text-to-speech conversion...

...For generating parametric synthetic text-to-speech used in
non-human sound sources like electronic systems, talking teakettle,
animal and percussive musical instrument e.g. snare drum, plucked
violin...

...In the synthetic text-to-speech generation, the output waveforms
representative of synthetic speech, can be provided by selecting
at least one voice source in a speech synthesizer.

...The figure shows the sub-segments of recorded sound sample used in
text-to-speech conversion
...Title Terms: GENERATE ; 
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DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts . reserv. 

011955584 

WPI Acc No: 1998-372494/199832 

XRPX Acc No: N98-292139 

Text speech-synthesis apparatus for FM data multiplex broadcasting, 
VICS - has control unit that performs speech synthesis or rule synthesis 
depending on correspondence or non-correspondence of word identification
attribute row and example pattern 

Patent Assignee: FUJITSU TEN LTD (FUTE ) 

Number of Countries: 001 Number of Patents: 001 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

JP 10149188 A 19980602 JP 96310890 A 19961121 199832 B 

Priority Applications (No Type Date) : JP 96310890 A 19961121 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
JP 10149188 A 5 G10L-003/00 

International Patent Class (Main) : G10L-003/00 
International Patent Class (Additional) : G10L-005/02 
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Text speech-synthesis apparatus for FM data multiplex broadcasting, 
VICS . . . 

...Abstract (Basic): if the attribute row from the analyser corresponds
with the example pattern, and performs speech synthesis if agreement is
obtained. Otherwise, a speech pattern is generated, and the rule
synthesis is performed in the intonation of the phonogram row using a
pitch pattern...

...The speech synthesis is performed using the intonation peculiar to
connection words linking words to form one sentence. An example table 
stores the fitting example pattern consisting of the intonation used 
for speech synthesis. A fitting type rhythm generator forms the 
pitch pattern from the example pattern, and uses the pitch pattern 
to link the waveform of the audio unit of the phonogram row of the 
word row. . . 



36/3,IC,K/4 (Item 4 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts . reserv. 



011754279 

WPI Acc No: 1998-171189/199816 

XRPX Acc No: N98-136025 

Text speech synthesis method - setting prosodic information for 
phoneme sequence of each word of word sequence obtained by analysis of 
input text by referring to word dictionary with speech waveform 
sequence obtained from phoneme sequence of each word 

Patent Assignee: NIPPON TELEGRAPH & TELEPHONE CORP (NITE ) 

Inventor: ABE M 

Number of Countries: 025 Number of Patents: 003 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

EP 831460 A2 19980325 EP 97116540 A 19970923 199816 B 

JP 10153998 A 19980609 JP 97239775 A 19970904 199833 

US 5940797 A 19990817 US 97933140 A 19970918 199939 



Priority Applications (No Type Date): JP 97239775 A 19970904; JP 96251707 A 

19960924 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
EP 831460 A2 E 13 G10L-005/04 

Designated States (Regional): AL AT BE CH DE DK ES FI FR GB GR IE IT LI 

LT LU LV MC NL PT RO SE SI 
JP 10153998 A 10 G10L-003/00 

US 5940797 A G10L-005/02 

International Patent Class (Main): G10L-003/00; G10L-005/02; G10L-005/04 



Text speech synthesis method... 



...setting prosodic information for phoneme sequence of each word of word
sequence obtained by analysis of input text by referring to word 
dictionary with speech waveform sequence obtained from phoneme sequence 
of each word 

...Abstract (Basic): reference to a word dictionary and identifying a
sequence of words in the input text to obtain a sequence of phonemes of
each word. Prosodic information on the phonemes is set in each
word. Phoneme waveforms are selected from a speech waveform
dictionary which corresponds to the phonemes in each word to generate
a sequence of phoneme waveforms.

Prosodic information is extracted from input actual speech. One part
of the extracted prosodic information and one part of the set
prosodic information are selected. A synthesised speech is generated
by controlling the sequence of phoneme waveforms with the selected
prosodic information.

Title Terms: WAVEFORM ; 



36/3,IC,K/5 (Item 5 from file: 350)

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts . reserv. 

011659818 

WPI Acc No: 1998-076726/199807 

XRPX Acc No: N98-061382 
Synthetic text-to-speech generating - converts speech synthesiser
control parameters into output wave forms representative of synthetic 
speech to be spoken by selecting and combining at least two voice sources 

Patent Assignee: APPLE COMPUTER INC (APPY ) 

Inventor: CECYS M L 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

US 5704007 A 19971230 US 94212488 A 19940311 199807 B 

US 96727845 A 19961004 

Priority Applications (No Type Date) : US 94212488 A 19940311; US 96727845 A 

19961004 
Patent Details : 

Patent No Kind Lan Pg Main IPC Filing Notes 

US 5704007 A 17 G10L-005/02 Cont of application US 94212488 

International Patent Class (Main) : G10L-005/02 
International Patent Class (Additional) : G10L-009/00 
Synthetic text-to-speech generating - ...

...converts speech synthesiser control parameters into output wave forms
representative of synthetic speech to be spoken by selecting and
combining at least two voice sources

...Abstract (Basic): The method involves generating a set of speech
synthesiser control parameters representative of text to be spoken, and
converting the speech synthesiser control parameters into output wave
forms. The latter is representative of the synthetic speech to be
spoken by selecting and combining at least two voice sources from a
number of voice sources in a speech synthesiser. That generates a
combined voice source and by passing the combined voice source through
an acoustic model of a human vocal tract...

. . .The number of voice sources has spectral content, which most closely 
matches that of the generated set of speech synthesiser control 
parameters and includes a normal voice source and a bright voice source 
voice source, representative of text to be spoken. The speech 
synthesiser control parameters are converted into output wave forms 
representative of the synthetic speech to be spoken by selecting and 
combining at least two voice sources from the number of voice sources 
in a speech synthesiser to generate a combined voice source... 
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ADVANTAGE - Provides multiple voice source, each of which has certain 
desirable spectral content such that more natural human like 
synthesised speech can be generated with reduced reliance on signal 
processing. . . 

Title Terms: GENERATE ; 



36/3,IC,K/6 (Item 6 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts . reserv. 

011469438 

WPI Acc No: 1997-447345/199741 
XRPX Acc No: N97-372821 

Mandarin syllable-signal synthesis method - synthesising periodical
waveform part by performing time proportionated-interpolation and
resampling operation
Patent Assignee: GUU H (GUUH-I) 
Inventor: GUU H 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

TW 309588 A 19970701 TW 96116039 A 19961224 199741 B 

Priority Applications (No Type Date) : TW 96116039 A 19961224 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
TW 309588 A 28 G10L-009/00 

International Patent Class (Main) : G10L-009/00 

. . . synthesising periodical waveform part by performing time 
proportionated-interpolation and resampling operation 

...Abstract (Basic): The method is based on time-domain waveform
processing. The effect of non-linearly warping the formant trace is
largely decreased when changing one of the values of the two
parameters, duration and pitch-frequency trace. The method
synthesizes the periodical-waveform part by performing a type of
time-proportionated-interpolation and a type of resampling operation.
This lets the flexibility of independent control of the three factors,
duration, pitch-frequency trace, and vocal-tract length, be largely
increased. Among the three, the factor of vocal-tract length is new...

...When the values of the two factors, vocal-tract length and
pitch-frequency trace's height, are appropriately set, many distinct timbres
can be synthesized by manipulating only a male's original syllable
waveforms, e.g. the timbres of cartoon actors, children, women, and
men...

...USE/ADVANTAGE - For implementing prototype text-to-speech system
which can utter sentences, in real-time, in the timbre specified by
control messages within input text. For synthesis of dialogues of
dramas. Has increased flexibility in independent control of parameters
and capability to generate many timbres...

...Title Terms: WAVEFORM ; 
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011087670 

WPI Acc No: 1997-065594/199706 

XRPX Acc No: N97-053924 
Speech synthesiser - converts input text to sequence of representations 
of syllables or other phonetic units and retrieves stored parts of data 
to generate corresp. waveforms, and defines constant duration for 
regular beat period 

Patent Assignee: BRITISH TELECOM PLC (BRTE ) 

Inventor: BREEN A P 

Number of Countries: 071 Number of Patents: 005 
Patent Family: 



Patent No Kind Date Applicat No Kind Date Week

WO 9642079 A1 19961227 WO 96GB1430 A 19960613 199706 B

AU 9662311 A 19970109 AU 9662311 A 19960613 199717

EP 832481 A1 19980401 EP 96920927 A 19960613 199817

WO 96GB1430 A 19960613

JP 11507740 W 19990706 WO 96GB1430 A 19960613 199937

JP 97502810 A 19960613

AU 713208 B 19991125 AU 9662311 A 19960613 200006

Priority Applications (No Type Date) : EP 95304079 A 19950613 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 

WO 9642079 A1 E 12 G10L-005/04

Designated States (National) : AL AM AT AU AZ BB BG BR BY CA CH CN CZ DE 
DK EE ES FI GB GE HU IL IS JP KE KG KP KR KZ LK LR LS LT LU LV MD MG MK 
MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ TM TR TT UA UG US UZ VN 
Designated States (Regional) : AT BE CH DE DK EA ES FI FR GB GR IE IT KE 
LS LU MC MW NL OA PT SD SE SZ UG 

AU 713208 B G10L-005/04 Previous Publ. patent AU 9662311

Based on patent WO 9642079

AU 9662311 A G10L-005/04 Based on patent WO 9642079

EP 832481 A1 E G10L-005/04 Based on patent WO 9642079

Designated States (Regional) : BE DE FR GB IT 

JP 11507740 W 14 G10L-003/00 Based on patent WO 9642079 

International Patent Class (Main): G10L-003/00; G10L-005/04 

International Patent Class (Additional) : G10L-005/02 



converts input text to sequence of representations of syllables or 
other phonetic units and retrieves stored parts of data to generate 
corresp. waveforms, and defines constant duration for regular beat 
period 

...Abstract (Basic): The speech synthesiser has a device for supplying a
sequence of representations of phonetic units, and a device for 
retrieving stored portions of data to generate waveforms 
corresponding to the phonetic units. A device determines the durations 
for the phonetic units, and a processing device processes and adjusts 
the durations of the waveforms according to the determined durations 



...The determiner is operable to define a constant duration corresponding 
to a regular beat period and adjusts the duration depending on the 
nature of the phonetic unit and/or its context within the sequence. The 
device identifies word grouping in the sequence, and the. . . 
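
The duration rule described above can be illustrated with a short sketch. It is not the patented method: the beat period, the unit labels and the scaling factors are all hypothetical values chosen for illustration, and the abstract says only that a constant beat-related duration is adjusted according to the nature of each phonetic unit and its context.

```python
BEAT_MS = 200.0  # assumed constant duration for one regular beat period

# Hypothetical adjustment factors keyed by the kind of phonetic unit.
FACTORS = {"stressed": 1.25, "unstressed": 0.75}


def unit_durations(units, beat_ms=BEAT_MS):
    """Map (unit, kind) pairs to target durations in milliseconds.

    Units of an unknown kind keep the unadjusted beat duration.
    """
    return [beat_ms * FACTORS.get(kind, 1.0) for _unit, kind in units]
```

The waveform-processing stage would then stretch or compress each retrieved waveform to the target duration computed here.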

...USE/ADVANTAGE - E.g. for text-to-speech synthesisers...
...Title Terms: GENERATE ; 



36/3,IC,K/8 (Item 8 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 



Dialog patbib 9139 a a 25723 






Report for SPE Fan Tsang 08/948328 September 27, 2000 09:20 



(c) 2000 Derwent Info Ltd. All rts . reserv. 
010997609 

WPI Acc No: 1996-494558/199649 

XRPX Acc No: N96-417079 
Audio synthesiser for text speech synthesis - has waveform
superposition processing part that produces source signal of audio data that
drives vocal tract filter part

Patent Assignee: TOSHIBA KK (TOKE ) 

Inventor: AKAMINE M; KAGOSHIMA T 

Number of Countries: 002 Number of Patents: 002 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

JP 8254993 A 19961001 JP 9557773 A 19950316 199649 B 

US 5890118 A 19990330 US 96613093 A 19960308 199920 

Priority Applications (No Type Date) : JP 9557773 A 19950316 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
JP 8254993 A 11 G10L-005/04 

US 5890118 A G10L-009/04 

International Patent Class (Main) : G10L-005/04; G10L-009/04 
International Patent Class (Additional) : G10L-009/10 
Audio synthesiser for text speech synthesis...

...has waveform superposition processing part that produces source
signal of audio data that drives vocal tract filter part

...Abstract (Basic): The synthesiser comprises a memory unit (21), which
outputs selected waveforms from stored waveforms representing
frames of source signals of audio data on passing information
corresponding to the audio signal which is to be synthesized. The
selected waveforms are interpolated by an interpolating unit (22)
corresponding to two continuous outputs from the memory unit, which
results in a source signal waveform of an audio data...

...The source signal waveforms are subjected to superposition in the
positions determined by a position determining unit (11). Superposition
of the source signal waveform is carried out by a superposition
processing unit (23), whose output drives a vocal tract filter (15).
The vocal tract filter approximates the vocal...

...USE/ADVANTAGE - For producing composite tone audio from information
like tone symbol string, pitch and tone continuation time length.
Reduces variation in tone and pitch, thus providing smooth natural
continuous composite tone...

...Title Terms: WAVEFORM ; 



36/3,IC,K/9 (Item 9 from file: 350) 

DIALOG (R) File 350: Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv. 

010684335 

WPI Acc No: 1996-181291/199619 

XRPX Acc No: N96-152330 
Speech synthesis method using concatenation and partial overlapping of
waveforms - sub-dividing waveforms associated with voice sounds into
intervals corresp. to responses of vocal duct to series of excitation
impulses of cords and synchronous to fundamental frequency of each signal

Patent Assignee: CSELT CENT STUDI LAB TELECOM SPA (CSEL ) 
Inventor: FOTI E; NEBBIA L; SANDRI S

Number of Countries: 012 Number of Patents: 009

Patent Family:

Patent No Kind Date Applicat No Kind Date Week
199619 B
JP 8110789 A 19960430 JP 95175553 A 19950620
ES 2113329 T1 EP 199824
US 19980630 US 199833
CA 200035
JP JP 19950620 200043



Priority Applications (No Type Date) : IT 94T0756 A 19940929 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
EP 706170 A2 E 25 G10L-005/04 

Designated States (Regional) : BE DE DK ES FR GB IT NL SE

Based on patent EP 706170

Previous Publ. patent JP 8110789
International Patent Class (Main): G10L-000/00; G10L-003/00; G10L-005/04;
G10L-009/00; G10L-009/12; G10L-013/08
International Patent Class (Additional) : G10L-013/06
009/00
005/04
US
C E G10L-009/00
JP B2 15 G10L-013/08



Speech synthesis method using concatenation and partial overlapping of
waveforms - ...

...sub-dividing waveforms associated with voice sounds into intervals
corresp. to responses of vocal duct to series of excitation impulses of
cords and synchronous to fundamental frequency of

...Abstract (Basic): The speech signal synthesis method involves using
time-concatenation of waveforms representing elementary speech. The
waveforms associated with voice sounds are sub-divided into intervals
corresp. to responses of the vocal duct to a series of impulses of vocal
cord excitation and synchronous with the fundamental waveform
frequency. The waveform in each interval is weighted, and the
resulting signals are replaced with a replica shifted in time by an
amount depending on prosodic information. The synthesis is performed
by overlapping and adding the shifted signals...

...left and right analysis edges. Two connecting functions are applied in
turn, and each interval of the synthesised signal is built by
reproducing unchanged the waveform in the unchanging part of the
original interval, and by aligning in time and adding the waveforms
generated by the connecting functions...
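
The weighting, time-shifting and overlap-add steps named in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes the excitation impulse positions (pitch marks) are already known and evenly spaced, weights each interval with a Hann window two source periods wide, and re-spaces the weighted intervals at a new period. The names `hann` and `overlap_add` and the window choice are illustrative assumptions.

```python
import math

def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def overlap_add(signal, marks, new_period):
    """Re-space pitch-synchronous intervals at `new_period` samples.

    signal     : list of float samples
    marks      : ascending sample indices of excitation impulses (assumed known)
    new_period : spacing between interval centres in the output
    """
    if len(marks) < 2:
        raise ValueError("need at least two pitch marks")
    half = marks[1] - marks[0]          # assumes a constant source period
    win = hann(2 * half + 1)            # weighting spans two source periods
    out = [0.0] * ((len(marks) - 1) * new_period + 2 * half + 1)
    for k, m in enumerate(marks):
        centre = half + k * new_period  # shifted position of this interval
        for j in range(-half, half + 1):
            src = m + j
            if 0 <= src < len(signal):
                # overlap and add the weighted, shifted replica
                out[centre + j] += signal[src] * win[j + half]
    return out
```

With `new_period` equal to the source period, the Hann-weighted intervals sum back to the original amplitude between the first and last marks; a smaller or larger `new_period` raises or lowers the pitch of the output.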

...USE/ADVANTAGE - Pref. for text-to-speech synthesis. Synthesis signal
has more natural sound.
...Title Terms: CONCATENATED ; 
(c) 2000 Derwent Info Ltd. All rts. reserv.
010563524 

WPI Acc No: 1996-060477/199607 

XRPX Acc No: N96-050445 

Text to speech system e.g. for workstation interaction, disabled 
person aid - controls operation of linguistic processor according to 
request signal from acoustic processor to process dispatcher indicating 
it is ready to process more speech segment from linguistic 

Patent Assignee: INT BUSINESS MACHINES CORP (IBMC ) ; IBM CORP (IBMC ) 

Inventor: SHARMAN R A 

Number of Countries: 005 Number of Patents: 005 
Patent Family: 



Patent No Kind Date Applicat No Kind Date Week
GB 2291571 GB 9414539 19940719
JP 19960202 JP 95122096 19950522 199615
EP 694904 EP 95301164 199814
US 19980630 US 94343304 A 19941122 199833



Priority Applications (No Type Date) : GB 9414539 A 19940719 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
GB 2291571 A 21 G10L-005/04 

EP 694904 A2 E 12 G10L-005/04 

Designated States (Regional) : DE FR GB 
JP 8030287 A 11 G10L-003/00 

EP 694904 A3 G10L-005/04 

US 5774854 A G10L-005/02 

International Patent Class (Main) : G10L-003/00; G10L-005/02; G10L-005/04 
International Patent Class (Additional) : G06F-003/16; G10L-009/00 

Text to speech system e.g. for workstation interaction, disabled 
person aid. . . 

...Abstract (Basic): The TTS (text to speech) system converts input
text into an output acoustic signal simulating natural speech. The
system has a linguistic processor (210) for generating a listing of
speech segments and associated parameters from the input text. An
acoustic processor (220) generates the output acoustic waveform
from this listing...



36/3,IC,K/11 (Item 11 from file: 350)

DIALOG (R) File 350: Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts. reserv. 

008388294 

WPI Acc No: 1990-275295/199036 
XRPX Acc No: N90-212896 

Text to speech synthesis system - has parameter generator that 
converts formant allophone data derived from code book tables 

Patent Assignee: CENTIGRAM COMMUNICATIONS CORP (CENT-N); MALSHEEN B J
(MALS-I); SPEECH PLUS INC (SPEE-N)
Inventor: GRONER G F; MALSHEEN B J; WILLIAMS L D; GRONER G; WILLIAMS L 
Number of Countries: 015 Number of Patents: 006 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

WO 9009657 A 19900823 199036 B 

US 4979216 A 19901218 US 89312692 A 19890217 199102 

EP 458859 A 19911204 EP 90903452 A 19900202 199149 
Priority Applications (No Type Date) : US 89312692 A 19890217
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
WO 9009657 A 



Designated States (National) : CA JP 

Designated States (Regional): AT BE CH DE DK ES FR GB IT LU NL SE 
EP 458859 A 

Designated States (Regional): DE GB 
EP 458859 B1 E 30 G10L-005/04 Based on patent WO 9009657

Designated States (Regional) : DE GB 
DE 69031165 E G10L-005/04 Based on patent EP 458859 

Based on patent WO 9009657 
International Patent Class (Main) : G10L-005/04 

International Patent Class (Additional) : G06F-015/34; G10L-005/00 



Text to speech synthesis system. . . 



. . .has parameter generator that converts formant allophone data derived 
from code book tables 



Abstract (Basic): The text-to-speech synthesiser reads the text and
uses the spelling to generate phonemes where appropriate, but uses a
dictionary look-up where the spelling is misleading. The consonant
allophones are generated in the usual way but the vowels also have
their allophones chosen by their context. All known allophones for a
given language are stored in...

ADVANTAGE - By choosing vowel as well as formant allophones the
synthetic speech is made to sound more natural. (50pp Dwg.No.7/11)

Abstract (Equivalent): A text-to-speech synthesis system,
comprising: text conversion means (20, 22, 24) for converting a
specified text string into a corresponding string of consonant and
vowel phonemes (25), each the phoneme being selected from a predefined
set of phonemes including a multiplicity of consonant phonemes and a
multiplicity of vowel phonemes; parameter generating means (40) for
generating speech parameters corresponding to the string of phonemes
(25); and speech synthesising means (42) for generating a speech
waveform corresponding to the speech parameters generated by the
parameter generating means; characterised by: vowel allophone storage
means (90, 130) storing a multiplicity of predefined vowel allophones,
each vowel allophone being represented by a set of...

...and for then assigning to the vowel phoneme a selected one of the
predefined vowel allophones corresponding to the computed phoneme
context value; the parameter generating means (40) including means
for generating speech parameters for the assigned vowel allophones...

Abstract (Equivalent): The text-to-speech conversion system has a
parameter generator which converts the phonemes into formant
parameters, and a formant synthesiser which uses the formant parameters
to generate a synthetic speech waveform. A library of vowel
allophones is stored, each stored vowel allophone being represented by
formant parameters for four formants. The vowel allophone library
includes a...

...allophone with one or more pairs of phonemes preceding and following the
corresponding vowel phoneme in a phoneme string. When synthesising
speech, a vowel allophone generator uses the vowel allophone library
to provide formant parameters representative of a specified vowel
phoneme. The vowel allophone generator coacts with the context index
to select the proper vowel allophone, as determined by the phonemes
preceding and following the specified vowel phoneme. ADVANTAGE -
Synthesised...
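
The context-dependent selection described in this abstract amounts to a table look-up keyed by the neighbouring phonemes. The sketch below is only an illustration under stated assumptions: the library layout, the phoneme symbols, the `"#"` boundary marker, the function name `select_allophone` and every formant value are hypothetical placeholders, not data from the patent record.

```python
# Hypothetical allophone library. Each entry holds four formant values (Hz);
# the numbers are placeholders for illustration, not measured data.
ALLOPHONES = {
    "AE": {
        ("K", "T"): (700.0, 1750.0, 2600.0, 3500.0),  # context-specific allophone
        None: (660.0, 1700.0, 2400.0, 3400.0),        # default allophone
    },
}

def select_allophone(phonemes, i, library=ALLOPHONES):
    """Pick formant parameters for the vowel at position i from its context.

    The context is the pair (preceding phoneme, following phoneme); "#"
    stands for a string boundary. Unknown contexts fall back to the
    vowel's default entry.
    """
    vowel = phonemes[i]
    prev = phonemes[i - 1] if i > 0 else "#"
    nxt = phonemes[i + 1] if i + 1 < len(phonemes) else "#"
    entries = library[vowel]
    return entries.get((prev, nxt), entries[None])
```

For example, `select_allophone(["K", "AE", "T"], 1)` returns the context-specific entry, while `select_allophone(["B", "AE", "D"], 1)` falls back to the default.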
Title Terms: GENERATOR ; 
008229149

WPI Acc No: 1990-116150/199015 
XRPX Acc No: N90-089960 

Waveform addition-overlapping speech synthesis - using dictionary of 
diphone sound element derived by window analysis of speech signal 

Patent Assignee: FRANCE TELECOM (ETFR ); ETAT FR MIN PTT (ETFR ); MIN 

POSTS TELECOM & SPACE CENT NAT ETUD (ETFR ); HAMON C (HAMO-I) 
Inventor: HAMON C 

Number of Countries: 007 Number of Patents: 011 
Waveform addition-overlapping speech synthesis...

...Abstract (Equivalent): replaced with a time shift thereof equal to a
fundamental synthesis period, which is lesser than or greater than the
original fundamental period, responsive to prosodic information
relating to the fundamental synthesis frequency, (c) synthesis is
carried out by summing the thus shifted signals, characterised in that
the method does not...
...Abstract (Equivalent): element with a time shift thereof equal to the
fundamental synthesis period, which is lesser than or greater than the
original fundamental period responsive to prosodic information
relative to the fundamental synthesis period, and...

...c) summing the thus shifted signal to synthesize speech, said method 
being devoid of a modification of a pitch period of the speech sounds 
elements by spectral transformation between steps (a) and (b... 

...The process comprises supplying a sequence of phoneme codes and
respective prosodic information, and, for each phoneme, analysing and
synthesising each phoneme, and then concatenating the synthesized
phonemes. For each phoneme, two diphones are selected among the stored
diphones and the presence of voicing is determined...

...For voiced phonemes, the respective waveforms of the two diphones
constituting the phoneme are filtered by a window which is centered on
a point of the selected waveform representative of the beginning of a
pulse response of vocal cords to excitation. The window has a width
substantially equal to twice the greater of...

...USE - Speech synthesis process using diphones stored in a dictionary as
waveforms, for text-to-speech conversion...
Title Terms: WAVEFORM ; 



36/3,IC,K/13 (Item 13 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2000 Derwent Info Ltd. All rts . reserv. 

007865050 

WPI Acc No: 1989-130162/198917 
XRPX Acc No: N89-099196 

Generating speech from digitally stored co-articulated speech segments
- recovering stored segments and concatenating in real time then
applying data to sound generator
Priority Applications (No Type Date) : US 87107678 A 19871009
Generating speech from digitally stored co-articulated speech segments



...recovering stored segments and concatenating in real time then 
applying data to sound generator 

...Abstract (Basic): beginning, ending, and intermediate diphone sounds
from the recorded syllables. Data samples are stored representing the
extracted sounds in a digital memory device. A selected text to
speech sequence of diphones required to generate a desired message
is generated...

...Stored data is recovered from the digital memory for each diphone in the
selected sequence. The selected sequence of diphones is concatenated
directly without any interpolation signals, in real time, using the
recovered data. The concatenated diphone data is applied to a sound
generating circuit to generate a desired message with a 3 kHz
bandwidth...

...ADVANTAGE - Quality speech is generated using a reduced amount of
storage space and speech segments are joined in real time with smooth
transitions required for quality speech.

Filing Notes: Div ex application AU 8825481

...Abstract (Equivalent): A method of generating speech using prerecorded
real speech diaphones, said method comprising the steps of: digitally
recording as PCM data samples spoken carrier syllables in which desired
diaphones...

...the PCM data samples representing desired beginning, ending and
intermediate diaphones from the digitally recorded carrier syllables at
a substantially common preselected location in the waveform of each
diaphone; digitally compressing (27-85) the PCM samples of said
diaphones using adaptive differential pulse code modulation to
generate ADPCM encoded data; storing (77) the ADPCM encoded data
representing said extracted digital diaphones in a digital memory
device (91); generating (95) a selected text to speech sequence
of diaphones required to generate a desired message; recovering (115)
stored ADPCM encoded data from said digital memory device (91) for each
diaphone in said selected sequence of diaphones; reconstructing (123)
the PCM diaphone data samples from said recovered ADPCM encoded data;
concatenating said reconstructed PCM diaphone data samples in said
selected text to speech sequence of diaphones coarticulated speech
segments directly, in real time; and applying (125) the concatenated
reconstructed diaphone data samples to sound generating means
(97-101) to generate said desired message; said method characterised
by compressing the PCM data samples by generating (27, 31) a seed
quantiser for the first data sample in each diaphone, by storing (29,
33) the seed quantiser for the first data sample...

Abstract (Equivalent): are extracted from spoken carrier syllables and
digitally compressed for storage using adaptive differential pulse code
modulation (ADPCM). Beginning seed quantization and PCM values are
generated for each coarticulated speech segment and stored together
with the ADPCM encoded data in a coarticulated speech segment library

...ADPCM encoded data are recovered from the coarticulated speech segment
library and blown back using the initial quantization and PCM seed
values. This reconstructs and concatenates in real time the sequence
of coarticulated speech segments required by a text to speech
program to generate a desired high quality spoken message. Pref. the
coarticulated speech segments are diphones...
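
The role of the per-segment seed values can be shown with a simplified adaptive differential coder. This is a sketch under stated assumptions, not the coder from the patent: the 3-bit code range, the step adaptation rule and the names `adpcm_encode`, `adpcm_decode` and `_update` are all illustrative. The point it makes is that storing a seed PCM value and a seed quantiser step with each segment lets every segment be decoded on its own and concatenated directly.

```python
def _update(pred, step, code):
    """Advance the shared encoder/decoder state by one code."""
    pred = pred + code * step
    if abs(code) >= 3:
        step = min(step * 2, 2048)   # large difference: widen the quantiser
    elif abs(code) <= 1:
        step = max(step // 2, 1)     # small difference: narrow the quantiser
    return pred, step

def adpcm_encode(samples, seed_step):
    """Encode one segment; returns (seed_pcm, seed_step, codes)."""
    seed_pcm = samples[0]            # first sample is kept as plain PCM
    pred, step = seed_pcm, seed_step
    codes = []
    for s in samples[1:]:
        code = max(-4, min(3, round((s - pred) / step)))  # 3-bit signed code
        codes.append(code)
        pred, step = _update(pred, step, code)
    return seed_pcm, seed_step, codes

def adpcm_decode(seed_pcm, seed_step, codes):
    """Rebuild PCM samples for one segment from its seeds and codes."""
    pred, step = seed_pcm, seed_step
    out = [pred]
    for code in codes:
        pred, step = _update(pred, step, code)
        out.append(pred)
    return out
```

Two segments encoded separately, for instance `a = adpcm_encode([0, 4, 8, 12, 12, 12], 4)` and `b = adpcm_encode([100, 96, 92], 4)`, can be joined as `adpcm_decode(*a) + adpcm_decode(*b)` with no shared decoder state. The coding is lossy in general; these inputs were picked so the round trip is exact.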

...USE - Generating quality speech from prerecorded digitally stored
spoken speech segments in library in real time. Reduced memory
requirements. (Dwg. 10/10)

Title Terms: GENERATE;
05866088
TEXT VOICE SYNTHESIZER

PUB. NO.: 10-149188 [JP 10149188 A]
PUBLISHED: June 02, 1998 (19980602)
INVENTOR(s): FUJIMOTO HIROYUKI
YAMATO TOSHITAKA
ISHIKAWA OSAMU
APPLICANT(s): FUJITSU TEN LTD [421134] (A Japanese Company or Corporation),
JP (Japan)
APPL. NO.: 08-310890 [JP 96310890]
FILED: November 21, 1996 (19961121)
INTL CLASS: [6] G10L-003/00; G10L-005/02



TEXT VOICE SYNTHESIZER 



ABSTRACT 

PROBLEM TO BE SOLVED: To form almost natural voice synthesization 
concerning limited sentence examples... 

...SOLUTION: Concerning a text voice synthesizer for regularly 

synthesizing arbitrary sentences in voice, this device is provided with a 
word dictionary part 62 storing a lot of words and having identification 
attributes in partial... 

...control part 65 for collating a word string provided from a language
processing analytic part 63 with the sentence example pattern and
controlling inserted voice synthesization or regular synthesization. A
device for performing the inserted voice synthesization is provided with
an insert table 73 with intonation composed of conjugation for
conjugating plural words and the intonations of inserted word strings and
an inserted rhythm generating part 74 for generating a pitch pattern
while using the intonation of inserted word string and connecting a
waveform for the unit of a voice according to this pitch pattern,

36/3,IC,K/15 (Item 2 from file: 347)

DIALOG (R) File 347:JAPIO 

(c) 2000 JPO & JAPIO. All rts. reserv. 

05814190 

SPEECH SYNTHESIZER 

PUB. NO.: 10-097290 [JP 10097290 A] 

PUBLISHED: April 14, 1998 (19980414) 
INVENTOR(s): NISHIDA HIDEJI 

HIRAI HIROYUKI 

MIYATAKE MASANORI 

ONISHI HIROKI 

APPLICANT(s) : SANYO ELECTRIC CO LTD [000188] (A Japanese Company or 

Corporation), JP (Japan) 
APPL. NO. : 08-251646 [JP 96251646] 
FILED: September 24, 1996 (19960924) 

INTL CLASS: [6] G10L-005/04; G10L-003/00 

SPEECH SYNTHESIZER 

ABSTRACT 

PROBLEM TO BE SOLVED: To output a synthesized speech waveform of 
superior speech quality by reading an optimum unit speech waveform 
corresponding to a 1st vocal sound symbol part string divided in specific 
preferential order out of a waveform memory and connecting it... 

...SOLUTION: A text speech synthesizer 10 includes a microcomputer 
12. The microcomputer 12 receives an input character string consisting of a 
1st vocal sound symbol string consisting of text document... 

. . . dictionary 14 for text analysis to convert it into a vocal sound symbol 
string consisting of the 1st vocal sound symbol part string and also 

generate the pitch pattern and power pattern of this input character 
string. Then the microcomputer 12 shapes, connects, and edits unit speech 
waveforms registered in a speech waveform data base 16 according to the 
pitch pattern and power pattern, and outputs the resulting synthesized 
speech. Language information corresponding to vocal sound symbols of a 2nd 
vocal sound symbol string which is divided in specific preferential order 
is added to . . . 

36/3,IC,K/16 (Item 3 from file: 347)

DIALOG (R) File 347: JAPIO 

(c) 2000 JPO & JAPIO. All rts. reserv. 

05279284 

HARMONY GENERATING DEVICE 

PUB. NO.: 08-234784 [JP 8234784 A] 

PUBLISHED: September 13, 1996 (19960913) 
INVENTOR (s) : KAGEYAMA YASUO 

MATSUMOTO SHUICHI 
APPLICANT(s): YAMAHA CORP [000407] (A Japanese Company or Corporation), JP
(Japan)

APPL. NO. : 07-041767 [JP 9541767] 
FILED: March 01, 1995 (19950301) 

INTL CLASS: [6] G10K-015/04; G10H-001/38 

HARMONY GENERATING DEVICE 

ABSTRACT 

PURPOSE: To provide a KARAOKE device which generates a harmony voice 
signal even unless the pitch of a text voice signal is detected... 

...sound volume detection part 43, and a multiplier 45. The singing voice
signal is multiplied by a window function through the multiplier 45 and cut
waveform element data of one cycle are stored in a memory 46. A readout
control part 48 for harmony data accesses the memory 46 and the signal
obtained by repeatedly reading waveform element data out at intervals
corresponding to a harmony frequency is the harmony voice signal. The
window function is one cycle long in terms of...

...window function so controlled that the peak detected by the peak
detection part 41 is at the center of the window function. A window
function generation part 44 cuts the waveform element data at intervals
of tens of ms and waveform element data corresponding to a timbre are
written in the memory 46; when phonemes change, a phoneme detection part 42
transmits that to the window function generation part 44 to generate
the window function.
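
The cut-and-repeat scheme in this abstract can be sketched as two small functions: one windows a single cycle of the singing voice around a detected peak, the other reads that element out repeatedly at the spacing of the harmony pitch. This is a rough illustration only; the Hann window, the function names `cut_element` and `read_out`, and the sample-based `interval` parameter are assumptions and do not come from the record.

```python
import math

def cut_element(signal, peak, cycle_len):
    """Cut one cycle centred on `peak`, tapered by a Hann window (cycle_len >= 2)."""
    half = cycle_len // 2
    element = []
    for j in range(cycle_len):
        src = peak - half + j
        sample = signal[src] if 0 <= src < len(signal) else 0.0
        w = 0.5 - 0.5 * math.cos(2 * math.pi * j / (cycle_len - 1))
        element.append(sample * w)
    return element

def read_out(element, interval, n_out):
    """Repeat `element` every `interval` samples, summing where copies overlap."""
    if interval < 1:
        raise ValueError("interval must be at least one sample")
    out = [0.0] * n_out
    start = 0
    while start < n_out:
        for j, v in enumerate(element):
            if start + j < n_out:
                out[start + j] += v
        start += interval
    return out
```

Here `interval` would be the sample rate divided by the harmony frequency; because only the read-out spacing changes, the sketch never needs the pitch of the input voice itself.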

36/3,IC,K/17 (Item 4 from file: 347) 

DIALOG (R) File 347:JAPIO 

(c) 2000 JPO & JAPIO. All rts. reserv. 

03708597 

DEVICE AND METHOD FOR SYNTHESIZING SOUND RULE 

PUB. NO. : 04-073697 [JP 4073697 A] 

PUBLISHED: March 09, 1992 (19920309) 
INVENTOR (s): TAKEDA SHOICHI 

ASAKAWA YOSHIAKI 

ICHIKAWA HIROSHI 

APPLICANT (s) : HITACHI LTD [000510] (A Japanese Company or Corporation), JP 
(Japan) 

APPL. NO.: 02-183947 [JP 90183947] 
FILED: July 13, 1990 (19900713) 

INTL CLASS: [5] G10L-005/00; G10L-003/00 

JOURNAL: Section: P, Section No. 1374, Vol. 16, No. 276, Pg. 165, June
19, 1992 (19920619)

DEVICE AND METHOD FOR SYNTHESIZING SOUND RULE 

ABSTRACT 

PURPOSE: To realize increased or decreased intensity included in natural
text voice vocalized by a person in rule synthesis by synthesizing
voice sequentially by a phoneme parameter string in accordance with an
input text and the time-changed pattern (pitch pattern) of a fundamental
frequency...

...CONSTITUTION: A control parameter generating part 3 decides accent,
intonation, phoneme duration, and a sound source power (amplitude)
correction value by a rule, and generates the pitch pattern and a
phoneme parameter time series according to them. Generated fundamental
frequency and phoneme parameter are sent to a voice synthesis part 4
sequentially, and a voice waveform is outputted. Thereby, since rhythm
control by a prominence generation rule is found based on the
quantitative analysis of natural voice, natural increased or decreased
intensity looking like a human being can be supplied to the voice
synthesized from an input document (text).